Paper 2023/1010
End-to-end Privacy Preserving Training and Inference for Air Pollution Forecasting with Data from Rival Fleets
Abstract
Privacy-preserving machine learning (PPML) promises to train machine learning (ML) models by combining data spread across multiple data silos. Theoretically, secure multiparty computation (MPC) allows multiple data owners to train models on their joint data without revealing the data to each other. However, prior implementations of this secure training using MPC have three limitations: they have only been evaluated on CNNs, and LSTMs have been ignored; fixed-point approximations have affected training accuracies compared to training in floating point; and due to significant latency overheads of secure training via MPC, its relevance for practical tasks with streaming data remains unclear. The motivation of this work is to report our experience of addressing the practical problem of secure training and inference of models for urban sensing problems, e.g., traffic congestion estimation or air pollution monitoring in large cities, where data can be contributed by rival fleet companies while balancing the privacy-accuracy trade-offs using MPC-based techniques. Our first contribution is to design a custom ML model for this task that can be efficiently trained with MPC within a desirable latency. In particular, we design a GCN-LSTM and securely train it on time-series sensor data for accurate forecasting, within 7 minutes per epoch. As our second contribution, we build an end-to-end system of private training and inference that provably matches the training accuracy of cleartext ML training. This work is the first to securely train a model with LSTM cells. Third, this trained model is kept secret-shared between the fleet companies and allows clients to make sensitive queries to this model while carefully handling potentially invalid queries. Our custom protocols allow clients to query predictions from privately trained models in milliseconds, all the while maintaining accuracy and cryptographic security.
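The abstract refers to fixed-point approximations and to a model that is kept secret-shared between fleet companies. As a rough illustration of those two ideas only, the sketch below shows textbook additive secret sharing over a 64-bit ring with a fixed-point encoding. It is not the paper's protocol, which the abstract does not describe. The ring size, the 16 fractional bits, the two-party setting, and all function names are assumptions made for this example.

```python
import secrets

# Assumed parameters for illustration; the abstract does not state them.
RING_BITS = 64
MOD = 1 << RING_BITS
FRAC_BITS = 16  # hypothetical fixed-point precision


def encode(x: float) -> int:
    """Map a real number to a fixed-point ring element."""
    return round(x * (1 << FRAC_BITS)) % MOD


def decode(v: int) -> float:
    """Map a ring element back to a real number (upper half = negative)."""
    if v >= MOD // 2:
        v -= MOD
    return v / (1 << FRAC_BITS)


def share(v: int, n: int = 2) -> list[int]:
    """Split v into n additive shares that sum to v modulo MOD."""
    shares = [secrets.randbelow(MOD) for _ in range(n - 1)]
    shares.append((v - sum(shares)) % MOD)
    return shares


def reconstruct(shares: list[int]) -> int:
    """Recombine additive shares into the original ring element."""
    return sum(shares) % MOD


def add_shares(a: list[int], b: list[int]) -> list[int]:
    """Each party adds its own two shares locally; no communication needed."""
    return [(x + y) % MOD for x, y in zip(a, b)]


# Two hypothetical fleets contribute one sensor reading each.
reading_a = share(encode(12.5))
reading_b = share(encode(30.25))
total = decode(reconstruct(add_shares(reading_a, reading_b)))
```

Addition of shared values is local in this scheme, while multiplication and non-linear functions require interactive protocols and truncation of the fixed-point result; those steps are one place where the rounding error mentioned in the abstract can enter.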
Metadata
- Category: Applications
- Publication info: Published elsewhere. Privacy Enhancing Technologies Symposium 2023
- Keywords: MPC, pollution, machine learning, training
- Contact author(s)
  - gauri @ mit edu
  - kramesh3 @ jh edu
  - t-anweshb @ microsoft com
  - divya gupta @ microsoft com
  - rahsha @ microsoft com
  - nichandr @ microsoft com
  - riju @ cse iitd ac in
- History
  - 2023-07-04: last of 2 revisions
  - 2023-06-29: received
- Short URL: https://ia.cr/2023/1010
- License: CC BY
BibTeX
@misc{cryptoeprint:2023/1010,
      author = {Gauri Gupta and Krithika Ramesh and Anwesh Bhattacharya and Divya Gupta and Rahul Sharma and Nishanth Chandran and Rijurekha Sen},
      title = {End-to-end Privacy Preserving Training and Inference for Air Pollution Forecasting with Data from Rival Fleets},
      howpublished = {Cryptology {ePrint} Archive, Paper 2023/1010},
      year = {2023},
      url = {https://eprint.iacr.org/2023/1010}
}