Private Multi-Party Machine Learning (NIPS 2016 Workshop)

Schedule

8.45	Welcome and Introduction
9.00	Mariana Raykova — Secure Computation: Why, How and When [slides]
The goal of secure computation is to facilitate the evaluation of functionalities that depend on the private inputs of several distrusting parties in a privacy-preserving manner that minimizes the information revealed about the inputs. In this talk we will introduce example problems motivating the work in the area of secure computation, including problems related to machine learning. We will discuss how we formalize the notion of privacy in cryptographic protocols and how we prove privacy-preserving properties for secure computation constructions. We will provide an overview of some main techniques and constructions for secure computation, including Yao's garbled circuits, approaches based on secret sharing, and others. Lastly, we will cover the different efficiency measures relevant for the practical use of secure computation protocols.
09.45	Stratis Ioannidis — Secure Function Evaluation at Scale [slides]
Secure Function Evaluation (SFE) allows an interested party to evaluate a function over private data without learning anything about the inputs other than the outcome of this computation. This offers a strong privacy guarantee: SFE enables, e.g., a medical researcher, a statistician, or a data analyst, to conduct a study over private, sensitive data, without jeopardizing the privacy of the study's participants (patients, online users, etc.). Nevertheless, applying SFE to “big data” poses several challenges. First, beyond any computational overheads due to encryptions and decryptions, executing an algorithm securely may lead to a polynomial blowup in the total work (i.e., number of computation steps) compared to execution in the clear. For large datasets, even going from linear to quadratic time is prohibitive. Second, secure evaluations of algorithms should also maintain parallelizability: an algorithm that is easy to parallelize in the clear should also maintain this property in its SFE version, if its execution is to scale. Addressing this is challenging as communication patterns between processors often reveal a lot about the private inputs. In this talk, we show that several machine learning and data mining algorithms can be executed securely while leading to only (a) a logarithmic blow-up in total work and (b) a logarithmic increase in the execution time when executed in parallel. Our techniques generalize to any algorithm that is graph-parallel, i.e., can be expressed through a sequence of scatter, gather, and apply operations. This includes several algorithms of great practical interest, including PageRank, matrix factorization, and training neural networks, to name a few.
10.30	Coffee Break
11.00	Jack Doerner — An Introduction to Practical MPC [slides]
The field of Secure Multiparty Computation (MPC) has seen an explosion of research over the past few years: though once a mostly theoretical idea, it has rapidly become a powerful, practical tool capable of efficiently solving real-world problems. However, this has come at the cost of dramatically increased complexity, expressed through a diversity of foundational techniques, high-level systems, and implementation folk knowledge. This talk will address the practical aspects of multiparty computation, discussing a number of low-level paradigms and their accompanying implementations, along with the various efficiency, functionality, and usability compromises that they offer. In addition, it will serve as an introduction to a set of tools and techniques that are commonly used in conjunction with generic MPC schemes, such as Oblivious RAM, permutation networks, and custom protocols. It is hoped that this will serve as a jumping-off point from which new problems can be addressed.
11.30	AnonML: Anonymous Machine Learning Over a Network of Data Holders (contributed talk)
Centralized data warehouses can be disadvantageous to users for many reasons, including privacy, security, and control. We propose AnonML, a system for anonymous, peer-to-peer machine learning. At a high level, AnonML functions by moving as much computation as possible to its end users, away from a central authority. AnonML users store and compute features on their own data, thereby limiting the amount of information they need to share. To generate a model, a group of data-holding peers first agree on a model definition, a set of feature functions, and an aggregator, a peer who temporarily acts as a central authority. Each peer anonymously sends several small packets of labeled feature data to the aggregator. In exchange, the aggregator generates a classifier and shares it with the group. In this way, AnonML data holders control what information they share on a feature-by-feature and model-by-model basis, and peers are able to disassociate features from their digital identities. Additionally, each peer can generate models suited to their particular needs, and the whole network benefits from the creation of novel, useful models.
11.50	Private Topic Modeling (contributed talk)
We develop a privatised stochastic variational inference method for Latent Dirichlet Allocation (LDA). The iterative nature of stochastic variational inference presents challenges: multiple iterations are required to obtain accurate posterior distributions, yet each iteration increases the amount of noise that must be added to achieve a reasonable degree of privacy. We propose a practical algorithm that overcomes this challenge by combining: (1) a relaxed notion of differential privacy, called concentrated differential privacy, which provides high-probability bounds on cumulative privacy loss rather than focusing on single-query loss, and is therefore well suited to iterative algorithms; and (2) privacy amplification resulting from subsampling of large-scale data. Focusing on conjugate exponential family models, our private variational inference privatises all the posterior distributions by simply perturbing expected sufficient statistics. Using Wikipedia data, we illustrate the effectiveness of our algorithm for large-scale data.
12.15	Poster Spotlights
13.00	Lunch Break
14.30	Practical Secure Aggregation for Federated Learning on User-Held Data (contributed talk)
Secure Aggregation is a class of Secure Multi-Party Computation algorithms wherein a group of mutually distrustful parties u ∈ U each hold a private value x_u and collaborate to compute an aggregate value, such as the sum P = SUM(x_u, u∈U), without revealing to one another any information about their private value except what is learnable from the aggregate value itself. In this work, we consider training a deep neural network in the Federated Learning model, using distributed gradient descent across user-held training data on mobile devices, using Secure Aggregation to protect the privacy of each user’s model gradient. We identify a combination of efficiency and robustness requirements which, to the best of our knowledge, are unmet by existing algorithms in the literature. We proceed to design a novel, communication-efficient Secure Aggregation protocol for high-dimensional data that tolerates up to 1/3 of users failing to complete the protocol. For 16-bit input values, our protocol offers 1.73× communication expansion for 2^10 users and 2^20-dimensional vectors, and 1.98× expansion for 2^14 users and 2^24-dimensional vectors.
15.00	Coffee Break
15.30	Poster Session
16.30	Richard Nock — Confidential Computing - Federate Private Data Analysis [slides]
TBD
17.15	Dawn Song — Lessons and Open Challenges in Secure and Privacy-Preserving Machine Learning and Data Analytics
TBD
18.00	Wrap Up

Accepted Papers

Peter Kairouz, Sewoong Oh, Pramod Viswanath
Differentially Private Multi-party Computation

We study the problem of multi-party computation under approximate (ε, δ) differential privacy. We assume an interactive setting with k parties, each possessing a private bit. Each party wants to compute a function defined on all the parties’ bits. Differential privacy ensures that there remains uncertainty in any party’s bit even when given the transcript of interactions and all the other parties’ bits. This paper is a follow-up to our work in [9], where we studied multi-party computation under (ε, 0) differential privacy. We generalize the results in [9] and prove that a simple non-interactive randomized response mechanism is optimal. Our optimality result holds for all privacy levels (all values of ε and δ), heterogeneous privacy levels across parties, all types of functions to be computed, all types of cost metrics, and both average and worst-case (over the inputs) measures of accuracy.
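
As an illustration of the randomized response idea named above, here is a minimal sketch for the pure (ε, 0) case only; it is not the mechanism analysed in the paper. The function names, and the debiasing step used to estimate the fraction of 1 bits, are assumptions made for this example.

```python
import math
import random

def randomized_response(bit, epsilon, rng=random):
    """Report the true bit with probability e^eps / (1 + e^eps), else flip it."""
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return bit if rng.random() < p_truth else 1 - bit

def estimate_mean(reports, epsilon):
    """Debias the average report to estimate the fraction of true 1 bits."""
    p = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # E[report] = (1 - p) + mean * (2p - 1), so invert that affine map.
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

if __name__ == "__main__":
    rng = random.Random(0)
    bits = [1] * 7000 + [0] * 3000
    reports = [randomized_response(b, 1.0, rng) for b in bits]
    print(estimate_mean(reports, 1.0))
```

Each party runs randomized_response locally and publishes only the result, so no interaction is needed; the estimate becomes noisier as epsilon shrinks.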

Martine De Cock, Rafael Dowsley, Anderson C. A. Nascimento and Stacey C. Newman
Fast, Privacy Preserving Linear Regression over Distributed Datasets based on Pre-Distributed Data

We propose a protocol for performing linear regression over a dataset that is distributed over multiple parties. The parties jointly compute a linear regression model without actually revealing their own datasets. Our solution is information-theoretically secure and is based on the assumption that a trusted initializer pre-distributes correlated randomness to the parties during a setup phase. The actual computation happens during an online phase and does not involve the trusted initializer. Our online protocol is orders of magnitude faster than previous solutions.
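
One common form of pre-distributed correlated randomness is a multiplication triple. The sketch below assumes two parties, additive shares modulo a prime, and a triple dealt during setup; it illustrates the setup/online split described above and is not the paper's protocol. All names and the modulus are hypothetical choices.

```python
import secrets

P = 2**61 - 1  # prime modulus, an illustrative choice

def share(value):
    """Split value into two additive shares modulo P."""
    r = secrets.randbelow(P)
    return r, (value - r) % P

def trusted_initializer():
    """Setup phase: deal shares of a random triple (a, b, c) with c = a * b mod P."""
    a, b = secrets.randbelow(P), secrets.randbelow(P)
    return share(a), share(b), share(a * b % P)

def multiply_shared(x_shares, y_shares, triple):
    """Online phase: the two parties obtain shares of x * y without the initializer."""
    (a0, a1), (b0, b1), (c0, c1) = triple
    # Each party publishes its share of x - a and y - b; a and b mask x and y.
    d = (x_shares[0] - a0 + x_shares[1] - a1) % P
    e = (y_shares[0] - b0 + y_shares[1] - b1) % P
    # x * y = d*e + d*b + e*a + a*b; only party 0 adds the public d*e term.
    z0 = (c0 + d * b0 + e * a0 + d * e) % P
    z1 = (c1 + d * b1 + e * a1) % P
    return z0, z1

if __name__ == "__main__":
    z = multiply_shared(share(12345), share(6789), trusted_initializer())
    print(sum(z) % P == 12345 * 6789 % P)
```

The online phase uses only additions and multiplications by public values, which is one reason protocols of this style can be fast once the setup has been done.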

Mijung Park, James Foulds, Kamalika Chaudhuri, Max Welling
Private Topic Modeling [PDF]

We develop a privatised stochastic variational inference method for Latent Dirichlet Allocation (LDA). The iterative nature of stochastic variational inference presents challenges: multiple iterations are required to obtain accurate posterior distributions, yet each iteration increases the amount of noise that must be added to achieve a reasonable degree of privacy. We propose a practical algorithm that overcomes this challenge by combining: (1) a relaxed notion of differential privacy, called concentrated differential privacy, which provides high-probability bounds on cumulative privacy loss rather than focusing on single-query loss, and is therefore well suited to iterative algorithms; and (2) privacy amplification resulting from subsampling of large-scale data. Focusing on conjugate exponential family models, our private variational inference privatises all the posterior distributions by simply perturbing expected sufficient statistics. Using Wikipedia data, we illustrate the effectiveness of our algorithm for large-scale data.

Michael Smith, Max Zwiessele, Neil Lawrence
Differentially Private Gaussian Processes [PDF]

Differential privacy allows algorithms to have provable privacy guarantees. Gaussian processes are a widely used approach for dealing with uncertainty in functions. This paper explores differentially private mechanisms for Gaussian processes. We compare adding noise both pre- and post-regression. For the former we develop a new kernel for use with binned data. For the latter we show that using inducing inputs allows us to reduce the noise scale. For the datasets used, the two strategies have comparable accuracy. Together these methods provide a starter toolkit for combining differential privacy and Gaussian processes.

Christina Heinze-Deml, Brian McWilliams, Nicolai Meinshausen
Preserving Differential Privacy Between Features in Distributed Estimation

Privacy is crucial in many applications of machine learning. Legal, ethical and societal issues restrict the sharing of sensitive data, making it difficult to learn from datasets that are partitioned between many parties. The differential privacy framework guarantees preserving anonymity in a large dataset. However, in the distributed setting very few approaches exist for private data sharing. To this end, we propose PriDE, a scalable framework for distributed estimation where each party communicates perturbed sketches of their locally held features, ensuring differentially private data sharing. For L2-penalized supervised learning problems, PriDE has bounded estimation error compared with the optimal estimates obtained without privacy constraints in the non-distributed setting.

Beyza Ermis, Taylan Cemgil
Data Sharing via Differentially Private Coupled Tensor Factorization

We develop a learning mechanism that protects the privacy of individuals in a distributed setting, in which 'N' data sites jointly estimate parameters of a statistical model conditioned on all the data. Unlike the classical asymmetric curator/analyst scenario, here, each data site is both a data provider and a data consumer. The sites want to maximize the utility of the released data while providing privacy guarantees for participating individuals. A natural statistical model for this distributed scenario is coupled tensor factorization. We use a novel connection between differential privacy and sampling from a Bayesian posterior via Stochastic Gradient Langevin Dynamics (SGLD) to derive an efficient and privacy preserving coupled tensor factorization algorithm. We demonstrate that the proposed method is able to provide good prediction accuracy on synthetic and real datasets while providing both site-level and user-level differential privacy.

Morten Dahl, Valerio Pastro, Mathieu Poumeyrol
Private Data Aggregation on a Budget [PDF]

We propose a practical solution to performing simple cross-user machine learning on a sensitive dataset distributed among a set of users with privacy concerns. We focus on a scenario in which a single company wishes to obtain the distribution of aggregate features, while ensuring a high level of privacy for the users. We are interested in the case where users own devices that are not necessarily powerful or online at all times, like smartphones or web browsers. This premise makes general solutions, such as general multiparty computation (MPC), less applicable. We design an efficient special-purpose MPC protocol that outputs aggregate features to the company, while keeping online presence and computational complexity on the users' side at a minimum. This basic protocol is secure against a majority of corrupt users, as long as they do not collude with the company. If they do, we still guarantee security, as long as the fraction of corrupt users is lower than a certain, tweakable, parameter. We propose different enhancements of this solution: one guaranteeing some degree of active security, and one that additionally ensures differential privacy (DP). Finally, we report on the performance of our implementation of the above solutions.

Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, Karn Seth
Practical Secure Aggregation for Federated Learning on User-Held Data [PDF]

Secure Aggregation is a class of Secure Multi-Party Computation algorithms wherein a group of mutually distrustful parties u ∈ U each hold a private value x_u and collaborate to compute an aggregate value, such as the sum P = SUM(x_u, u∈U), without revealing to one another any information about their private value except what is learnable from the aggregate value itself. In this work, we consider training a deep neural network in the Federated Learning model, using distributed gradient descent across user-held training data on mobile devices, using Secure Aggregation to protect the privacy of each user’s model gradient. We identify a combination of efficiency and robustness requirements which, to the best of our knowledge, are unmet by existing algorithms in the literature. We proceed to design a novel, communication-efficient Secure Aggregation protocol for high-dimensional data that tolerates up to 1/3 of users failing to complete the protocol. For 16-bit input values, our protocol offers 1.73× communication expansion for 2^10 users and 2^20-dimensional vectors, and 1.98× expansion for 2^14 users and 2^24-dimensional vectors.
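
The core cancellation idea behind sum-only aggregation can be sketched with pairwise masks: each pair of users shares a random vector that one adds and the other subtracts, so the masks vanish in the total. This is a simplified illustration, not the protocol from the paper; it handles no dropped users, and it draws the masks in one place, whereas in a real deployment each pair would agree on its mask privately. The function names are hypothetical.

```python
import random

MODULUS = 2**16  # matches the 16-bit input values mentioned in the abstract

def mask_inputs(inputs, modulus=MODULUS, rng=None):
    """Return masked vectors; for each pair (u, v), u adds a random mask and v subtracts it."""
    rng = rng or random.SystemRandom()
    n, dim = len(inputs), len(inputs[0])
    masked = [list(x) for x in inputs]
    for u in range(n):
        for v in range(u + 1, n):
            pair_mask = [rng.randrange(modulus) for _ in range(dim)]
            for i in range(dim):
                masked[u][i] = (masked[u][i] + pair_mask[i]) % modulus
                masked[v][i] = (masked[v][i] - pair_mask[i]) % modulus
    return masked

def aggregate(masked, modulus=MODULUS):
    """Server side: sum the masked vectors coordinate-wise; the pairwise masks cancel."""
    return [sum(column) % modulus for column in zip(*masked)]

if __name__ == "__main__":
    inputs = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
    print(aggregate(mask_inputs(inputs)))
```

The result is the sum modulo 2^16, so the true sum is recovered only when it fits in that range.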

Stephen Hardy, Wilko Henecka, Richard Nock
On Private Supervised Distributed Learning: Weakly Labeled and without Entity Resolution [PDF]

We describe a system with strong privacy guarantees that is able to learn (supervised) linear classifiers when the data is distributed and not all parties have labels. The privacy guarantees are due to the use of Rademacher observations (rados) for learning, where these rados are calculated using only the functionality provided by partially homomorphic encryption, which obscures the source data and provides realistic learning times. Furthermore, differentially private rados may be calculated, leading to a distributed machine learning algorithm where all data remains private to the contributors, and the resulting learnt model cannot be used to reconstruct any of the data. Finally, we demonstrate that these models can be learnt when the datasets only share some categorical features, and the correspondences between entities in the datasets are not known. We illustrate the performance of the model with a synthetic dataset.

Igor Colin, Christophe Dupuy
Decentralized Topic Modelling with Latent Dirichlet Allocation [PDF]

Privacy preserving networks can be modelled as decentralized networks (e.g., sensors, connected objects, smartphones), where communication between nodes of the network is not controlled by an all-knowing, central node. For this type of network, the main issue is to gather/learn global information on the network (e.g., by optimizing a global cost function) while keeping the (sensitive) information at each node. In this work, we focus on text information that agents do not want to share (e.g., text messages, emails, confidential reports). We use recent advances in decentralized optimization and topic models to infer topics from a graph with limited communication. We propose a method to adapt the latent Dirichlet allocation (LDA) model to decentralized optimization and show on synthetic data that we still recover similar parameters and similar performance at each node as with stochastic methods that access all the information in the graph.

Nicolas Papernot, Ulfar Erlingsson, Martin Abadi, Kunal Talwar, Ian Goodfellow
Machine Learning with Privacy by Knowledge Aggregation and Transfer [PDF]

Machine learning relies on the availability of high-quality training data and, whether by its inherent nature or by accident, this data will sometimes contain private information. When the model is to be published or made publicly accessible and the training data is not, it is important that the details of the sensitive training data cannot be inadvertently revealed by the model. This abstract presents a generally applicable approach to providing strong privacy guarantees for machine learning training data. The approach is based on combining, in a black-box fashion, multiple machine learning models trained with disjoint sensitive datasets, such as data for different users. Because they rely directly on sensitive data, these models are used only as "teachers" for a "student" machine learning model. However, when training the student, the teachers transfer only the labels upon which they all generally agree, via a noisy aggregation mechanism. The student has privacy properties that can be understood both intuitively (since no single teacher dictates the student's training) and formally, in terms of differential privacy. These properties address "glass-box" attacks of the kind that arise if an adversary not only queries the student but also inspects its internal workings. The approach imposes only weak assumptions on how the teachers are trained. It applies to powerful, deep models, possibly with many layers and parameters. Our experiments demonstrate that the approach applies to real-world machine learning tasks, at a reasonable cost in accuracy, privacy, and software complexity.
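
A noisy aggregation step of the kind described above can be sketched as a noisy plurality vote: count the teachers' predicted labels, add noise to each count, and release only the winning label. The use of Laplace noise, the scale parameter, and the function names are assumptions for this illustration; the abstract does not specify them.

```python
import random
from collections import Counter

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_label(teacher_votes, num_classes, scale, rng=None):
    """Return the class whose noisy vote count is largest."""
    rng = rng or random.Random()
    counts = Counter(teacher_votes)
    noisy = [counts.get(c, 0) + laplace_noise(scale, rng) for c in range(num_classes)]
    return max(range(num_classes), key=noisy.__getitem__)

if __name__ == "__main__":
    votes = [2] * 90 + [0] * 5 + [1] * 5  # 100 hypothetical teachers, 3 classes
    print(noisy_label(votes, num_classes=3, scale=1.0, rng=random.Random(0)))
```

When the teachers agree strongly, the noise rarely changes the outcome; when the vote is close, the released label is partly random, which is what limits the influence of any single teacher.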

Hiromi Arai, Hiroshi Nakagawa
Privacy risk analysis in learning from distributed data

Machine learning is one of the most powerful approaches to discovering knowledge from data. A large sample size will improve the performance of the learned models. On the other hand, data is often kept privately at each site for privacy reasons. Data anonymization is often used for privacy preservation in data sharing. However, anonymization decreases data utility while the privacy remains at risk. As an alternative approach for learning models from distributed private data, we focus on ensemble learning techniques. By using ensemble learning techniques, we can obtain a learned model based on all data by sharing only learned models from local sites. We compare the privacy and utility of sharing anonymized data and sharing learned models. The utility is examined by the performance of the classifier built from shared anonymized datasets and that built from shared classifiers. To examine privacy, we applied homogeneity attacks and model inversion. We evaluate privacy and utility using a real-world dataset and show the superiority of sharing learned models.

Hassan Takabi, Ehsan Hesamifard, Mehdi Ghasemi
Privacy Preserving Multi-party Machine Learning with Homomorphic Encryption [PDF]

Privacy preserving multi-party machine learning approaches enable multiple parties to train a machine learning model from aggregate data while ensuring the privacy of their individual datasets is preserved. In this paper, we propose a privacy preserving multi-party machine learning approach based on homomorphic encryption where the machine learning algorithm of choice is deep neural networks. We develop a theoretical foundation for implementing deep neural networks over encrypted data and utilize it in developing efficient and practical algorithms in the encrypted domain.

Lu Tian, Bargav Jayaraman, Quanquan Gu, David Evans
Aggregating Private Sparse Learning Models Using Multi-Party Computation [PDF]

We consider the problem of privately learning a sparse model across multiple sensitive datasets, and develop an approach where individual models are privately aggregated using secure multi-party computation to produce a joint model. We report some preliminary experiments on distributed sparse linear discriminant analysis, showing both the feasibility and effectiveness of our approach on experiments using heart disease data collected across four hospitals.

Tariq Elahi, Ryan Henry
Privacy-preserving Anomaly Detection in Tor [PDF]

This extended abstract presents our vision of PrivEy, a distributed data collection and anomaly detection framework for the Tor network. PrivEy builds on the general framework of PrivEx (CCS 2014), a system for privately collecting statistics about traffic egressing the Tor network; however, PrivEy extends PrivEx in several important respects: (i) it supports the collection of a wider array of data from a wider array of vantage points within the Tor network, and (ii) beyond merely producing differentially private summary statistics about the collected data, it can also use those data to continuously train ensemble classifiers with which to recognize anomalous patterns indicative of ongoing attacks against the Tor network and its users.

Pierre Dellenbach, Jan Ramon, Aurélien Bellet
A Decentralized and Robust Protocol for Private Averaging over Highly Distributed Data

We propose a decentralized protocol for a large set of users to privately compute averages over their joint data, which can later be used to learn more complex models. Our protocol can find a solution of arbitrary accuracy, does not rely on a trusted third party and preserves the privacy of users throughout the execution in both the honest-but-curious and malicious adversary models. Furthermore, we design a verification procedure which offers protection against malicious users joining the service with the goal of manipulating the outcome of the algorithm.

Phillipp Schoppmann, Adria Gascon, Mariana Raykova, David Evans, Samee Zahur, Jack Doerner, Borja Balle
Secure Distributed Linear Regression

We present a protocol for secure computation of linear regression models on vertically distributed datasets. It consists of two phases that build on commodity-based cryptography and garbled circuits, respectively, and provides security in the semi-honest threat model. For the second phase, three algorithms are presented and analyzed in terms of strengths and weaknesses. Evaluation of a prototypical implementation on artificial training data shows that linear regression models with a million samples and 100 features can be computed securely in less than half an hour on commodity hardware. Additionally, for even larger input datasets, a tradeoff between accuracy and computation time is identified and discussed. By approximating the result using the presented custom implementation of the iterative Conjugate Gradient Descent algorithm, a significant speedup can be achieved at a moderate price in terms of accuracy.
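
For readers unfamiliar with the iterative solver mentioned above, the following sketch runs the standard conjugate gradient method on the normal equations of a least-squares problem, entirely in the clear. It contains no cryptography and is not the paper's implementation; the helper names and the iteration cap are assumptions. Stopping after fewer iterations gives a cheaper but less accurate answer, which is the kind of tradeoff the abstract discusses.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def conjugate_gradient(A, b, iterations):
    """Approximately solve A x = b for a symmetric positive definite matrix A."""
    x = [0.0] * len(b)
    r = list(b)  # residual b - A x at the starting point x = 0
    p = list(r)  # first search direction
    rs_old = dot(r, r)
    for _ in range(iterations):
        if rs_old < 1e-20:  # already converged; avoids dividing by zero
            break
        Ap = matvec(A, p)
        alpha = rs_old / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x

def linear_regression_cg(X, y, iterations):
    """Fit weights w by running CG on the normal equations (X^T X) w = X^T y."""
    columns = list(zip(*X))
    gram = [[dot(ci, cj) for cj in columns] for ci in columns]
    rhs = [dot(ci, y) for ci in columns]
    return conjugate_gradient(gram, rhs, iterations)

if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
    y = [3.0, -2.0, 1.0, 4.0]  # generated from weights (3, -2) without noise
    print(linear_regression_cg(X, y, iterations=10))
```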

Bennett Cyphers, Kalyan Veeramachaneni
AnonML: Anonymous Machine Learning Over a Network of Data Holders

Centralized data warehouses can be disadvantageous to users for many reasons, including privacy, security, and control. We propose AnonML, a system for anonymous, peer-to-peer machine learning. At a high level, AnonML functions by moving as much computation as possible to its end users, away from a central authority. AnonML users store and compute features on their own data, thereby limiting the amount of information they need to share. To generate a model, a group of data-holding peers first agree on a model definition, a set of feature functions, and an aggregator, a peer who temporarily acts as a central authority. Each peer anonymously sends several small packets of labeled feature data to the aggregator. In exchange, the aggregator generates a classifier and shares it with the group. In this way, AnonML data holders control what information they share on a feature-by-feature and model-by-model basis, and peers are able to disassociate features from their digital identities. Additionally, each peer can generate models suited to their particular needs, and the whole network benefits from the creation of novel, useful models.

Jakub Konečný, H. Brendan McMahan, Felix Yu, Peter Richtárik, Ananda Theertha Suresh, Dave Bacon
Federated Learning: Strategies for Improving Communication Efficiency [PDF]

Federated Learning is a machine learning setting where the goal is to train a high-quality centralized model with training data distributed over a large number of clients each with unreliable and relatively slow network connections. We consider learning algorithms for this setting where on each round, each client independently computes an update to the current model based on its local data, and communicates this update to a central server, where the client-side updates are aggregated to compute a new global model. The typical clients in this setting are mobile phones, and communication efficiency is of utmost importance. In this paper, we propose two ways to reduce the uplink communication costs. The proposed methods are evaluated on the application of training a deep neural network to perform image classification. Our best approach reduces the upload communication required to train a reasonable model by two orders of magnitude.
