Private Multi‑Party Machine Learning

NIPS 2016 Workshop
Barcelona, December 9

Centre de Convencions Internacional Barcelona
Room 131-132

News: Slides from invited talks are now available!

Scope

The one-day workshop focuses on the problem of privacy-preserving machine learning in scenarios where sensitive datasets are distributed across multiple data owners. Such distributed scenarios arise often in practice, for example when different parties contribute different records to a dataset, or when the information about each record is split across different data owners.

Different communities have developed approaches to deal with this problem, including differential privacy-like techniques where noisy sketches are exchanged between the parties, homomorphic encryption where operations are performed on encrypted data, and tailored approaches using techniques from the field of secure multi-party computation. The workshop will serve as a forum to unify different perspectives on this problem, explore the relative merits of each approach, and investigate future lines of research.

The workshop will place particular emphasis on the decentralized aspects of privacy-preserving machine learning. This covers a large number of realistic scenarios where the classical setup of differential privacy, with a 'trusted curator' that prepares the data, cannot be directly applied. The problem of privacy-preserving computation gains relevance in this model, and effectively leveraging the tools developed by the cryptographic community to build private multi-party learning algorithms poses a remarkable challenge.

Invited Speakers

  • Jack Doerner (Virginia)
  • Stratis Ioannidis (Northeastern)
  • Richard Nock (Australian National University and University of Sydney)
  • Mariana Raykova (Yale)
  • Dawn Song (UC Berkeley)

Schedule

8.45 Welcome and Introduction
9.00 Mariana Raykova — Secure Computation: Why, How and When   [slides]
The goal of secure computation is to facilitate the evaluation of functionalities that depend on the private inputs of several distrusting parties in a privacy-preserving manner that minimizes the information revealed about the inputs. In this talk we will introduce example problems motivating work in the area of secure computation, including problems related to machine learning. We will discuss how we formalize the notion of privacy in cryptographic protocols and how we prove privacy-preserving properties for secure computation constructions. We will provide an overview of some of the main techniques and constructions for secure computation, including Yao garbled circuits, approaches based on secret sharing, and others. Lastly, we will cover the different efficiency measures relevant to the practical use of secure computation protocols.
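To give a flavour of the secret-sharing approach mentioned in the abstract, the following minimal Python sketch (ours, not from the talk) shows additive secret sharing over a prime field; note that adding two shared values requires no interaction between the parties.

    import secrets

    P = 2**61 - 1  # prime modulus; secrets live in the field Z_P

    def share(x, n_parties):
        """Split secret x into n_parties additive shares summing to x mod P."""
        shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
        shares.append((x - sum(shares)) % P)
        return shares

    def reconstruct(shares):
        """Recover the secret by summing all shares mod P."""
        return sum(shares) % P

    # Addition of two shared secrets is local: each party simply adds
    # its own shares of x and y.
    x_shares, y_shares = share(42, 3), share(100, 3)
    z_shares = [(a + b) % P for a, b in zip(x_shares, y_shares)]
    assert reconstruct(z_shares) == 142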
9.45 Stratis Ioannidis — Secure Function Evaluation at Scale   [slides]
Secure Function Evaluation (SFE) allows an interested party to evaluate a function over private data without learning anything about the inputs other than the outcome of the computation. This offers a strong privacy guarantee: SFE enables, e.g., a medical researcher, a statistician, or a data analyst to conduct a study over private, sensitive data without jeopardizing the privacy of the study's participants (patients, online users, etc.). Nevertheless, applying SFE to “big data” poses several challenges. First, beyond any computational overheads due to encryptions and decryptions, executing an algorithm securely may lead to a polynomial blowup in the total work (i.e., number of computation steps) compared to execution in the clear. For large datasets, even going from linear to quadratic time is prohibitive. Second, secure evaluations of algorithms should also remain parallelizable: an algorithm that is easy to parallelize in the clear should retain this property in its SFE version if its execution is to scale. Addressing this is challenging, as communication patterns between processors often reveal a lot about the private inputs. In this talk, we show that several machine learning and data mining algorithms can be executed securely while incurring only (a) a logarithmic blow-up in total work and (b) a logarithmic increase in execution time when run in parallel. Our techniques generalize to any algorithm that is graph-parallel, i.e., that can be expressed through a sequence of scatter, gather, and apply operations. This covers several algorithms of great practical interest, including PageRank, matrix factorization, and the training of neural networks.
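To make the graph-parallel abstraction concrete, here is a plaintext (non-secure) sketch of one PageRank iteration written as gather and apply steps; all names are illustrative, and a secure version would execute the same steps under SFE.

    # One PageRank iteration in the gather/apply style of graph-parallel
    # computation; out_edges maps each vertex to its out-neighbours.
    def pagerank_step(ranks, out_edges, damping=0.85):
        n = len(ranks)
        # Gather: each vertex accumulates contributions from its in-neighbours.
        gathered = {v: 0.0 for v in ranks}
        for u, neighbours in out_edges.items():
            for v in neighbours:
                gathered[v] += ranks[u] / len(neighbours)
        # Apply: each vertex updates its rank from the gathered sum.
        return {v: (1 - damping) / n + damping * gathered[v] for v in ranks}

    ranks = {"a": 1/3, "b": 1/3, "c": 1/3}
    edges = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
    ranks = pagerank_step(ranks, edges)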
10.30 Coffee Break
11.00 Jack Doerner — An Introduction to Practical MPC   [slides]
The field of Secure Multiparty Computation (MPC) has seen an explosion of research over the past few years: though once a mostly theoretical idea, it has rapidly become a powerful, practical tool capable of efficiently solving real-world problems. However, this has come at the cost of dramatically increased complexity, expressed through a diversity of foundational techniques, high-level systems, and implementation folk knowledge. This talk will address the practical aspects of multiparty computation, discussing a number of low-level paradigms and their accompanying implementations, along with the various efficiency, functionality, and usability compromises that they offer. In addition, it will serve as an introduction to a set of tools and techniques that are commonly used in conjunction with generic MPC schemes, such as Oblivious RAM, permutation networks, and custom protocols. It is hoped that this will serve as a jumping-off point from which new problems can be addressed.
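As one concrete building block named above, a permutation network is assembled from two-input switches. The branch-free conditional swap below (our illustration, not from the talk) is such a switch; inside an MPC protocol the control bit would itself be secret.

    def cond_swap(bit, a, b):
        """Two-input switch of a permutation (e.g. Waksman) network:
        returns (a, b) if bit == 0 and (b, a) if bit == 1, computed
        without branching on the secret control bit."""
        d = (a ^ b) * bit
        return a ^ d, b ^ d

    assert cond_swap(0, 3, 9) == (3, 9)
    assert cond_swap(1, 3, 9) == (9, 3)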
11.30 AnonML: Anonymous Machine Learning Over a Network of Data Holders (contributed talk)
Centralized data warehouses can be disadvantageous to users for many reasons, including privacy, security, and control. We propose AnonML, a system for anonymous, peer-to-peer machine learning. At a high level, AnonML functions by moving as much computation as possible to its end users, away from a central authority. AnonML users store and compute features on their own data, thereby limiting the amount of information they need to share. To generate a model, a group of data-holding peers first agree on a model definition, a set of feature functions, and an aggregator, a peer who temporarily acts as a central authority. Each peer anonymously sends several small packets of labeled feature data to the aggregator. In exchange, the aggregator generates a classifier and shares it with the group. In this way, AnonML data holders control what information they share on a feature-by-feature and model-by-model basis, and peers are able to disassociate features from their digital identities. Additionally, each peer can generate models suited to their particular needs, and the whole network benefits from the creation of novel, useful models.
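A hypothetical sketch of the data flow just described (ours, not the authors' system): each peer evaluates the agreed feature functions locally and ships only (features, label) packets to the temporary aggregator, which trains a model and returns it. The feature functions and the scikit-learn classifier are illustrative assumptions.

    from sklearn.linear_model import LogisticRegression

    # Agreed-upon feature functions (illustrative); raw records never leave a peer.
    FEATURE_FNS = [len, lambda record: record.count(" ")]

    def peer_packets(records):
        """Run locally by each data holder on its own (record, label) pairs."""
        return [([f(r) for f in FEATURE_FNS], y) for r, y in records]

    def aggregate_and_train(packets):
        """Run by the aggregator on anonymously received packets."""
        X = [feats for feats, _ in packets]
        y = [label for _, label in packets]
        return LogisticRegression().fit(X, y)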
11.50 Private Topic Modeling (contributed talk)
We develop a privatised stochastic variational inference method for Latent Dirichlet Allocation (LDA). The iterative nature of stochastic variational inference presents challenges: multiple iterations are required to obtain accurate posterior distributions, yet each iteration increases the amount of noise that must be added to achieve a reasonable degree of privacy. We propose a practical algorithm that overcomes this challenge by combining: (1) a relaxed notion of differential privacy, called concentrated differential privacy, which provides high-probability bounds on cumulative privacy loss rather than focusing on single-query loss, and is therefore well suited to iterative algorithms; and (2) the privacy amplification resulting from subsampling of large-scale data. Focusing on conjugate exponential family models, our private variational inference privatises all the posterior distributions by simply perturbing the expected sufficient statistics. Using Wikipedia data, we illustrate the effectiveness of our algorithm on large-scale data.
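As a minimal illustration of perturbing expected sufficient statistics (our sketch, not the authors' algorithm), assuming a Gaussian mechanism whose noise scale sigma is supplied by a concentrated-differential-privacy accountant:

    import numpy as np

    def privatise_stats(expected_stats, sigma, rng=None):
        """Release noisy expected sufficient statistics; each stochastic
        variational inference update then uses the perturbed statistics
        instead of the exact ones."""
        rng = rng or np.random.default_rng()
        return expected_stats + rng.normal(scale=sigma, size=expected_stats.shape)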
12.15 Poster Spotlights
13.00 Lunch Break
14.30 Practical Secure Aggregation for Federated Learning on User-Held Data (contributed talk)
Secure Aggregation is a class of Secure Multi-Party Computation algorithms wherein a group of mutually distrustful parties u ∈ U each hold a private value x_u and collaborate to compute an aggregate value, such as the sum Σ_{u∈U} x_u, without revealing to one another any information about their private values except what is learnable from the aggregate itself. In this work, we consider training a deep neural network in the Federated Learning model, using distributed gradient descent across user-held training data on mobile devices, with Secure Aggregation protecting the privacy of each user's model gradient. We identify a combination of efficiency and robustness requirements which, to the best of our knowledge, are unmet by existing algorithms in the literature. We proceed to design a novel, communication-efficient Secure Aggregation protocol for high-dimensional data that tolerates up to 1/3 of users failing to complete the protocol. For 16-bit input values, our protocol offers 1.73× communication expansion for 2^10 users and 2^20-dimensional vectors, and 1.98× expansion for 2^14 users and 2^24-dimensional vectors.
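The core cancellation trick behind such protocols can be sketched as follows (our simplified illustration, which ignores dropouts, key agreement, and the robustness machinery that is the paper's actual contribution): each pair of users derives a shared random mask that one adds and the other subtracts, so every mask cancels in the sum.

    import random

    M = 2**32  # arithmetic is modulo 2^32

    def masked_input(u, x_u, users, pair_seed):
        """User u's upload: x_u blinded by one mask per other user."""
        masked = x_u
        for v in users:
            if v == u:
                continue
            # u and v derive the same mask from their shared pairwise seed;
            # the lower-indexed user adds it, the other subtracts it.
            mask = random.Random(pair_seed(min(u, v), max(u, v))).randrange(M)
            masked += mask if u < v else -mask
        return masked % M

    users = [0, 1, 2]
    xs = {0: 5, 1: 7, 2: 9}
    seed = lambda a, b: 1000 * a + b  # stand-in for a shared pairwise secret
    total = sum(masked_input(u, xs[u], users, seed) for u in users) % M
    assert total == sum(xs.values())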
15.00 Coffee Break
15.30 Poster Session
16.30 Richard Nock — Confidential Computing - Federated Private Data Analysis   [slides]
TBD
17.15 Dawn Song — Lessons and Open Challenges in Secure and Privacy-Preserving Machine Learning and Data Analytics
TBD
18.00 Wrap Up

Accepted Papers

Peter Kairouz, Sewoong Oh, Pramod Viswanath
Differentially Private Multi-party Computation
Martine De Cock, Rafael Dowsley, Anderson C. A. Nascimento and Stacey C. Newman
Fast, Privacy Preserving Linear Regression over Distributed Datasets based on Pre-Distributed Data
Mijung Park, James Foulds, Kamalika Chaudhuri, Max Welling
Private Topic Modeling    [PDF]
Michael Smith, Max Zwiessele, Neil Lawrence
Differentially Private Gaussian Processes   [PDF]
Christina Heinze-Deml, Brian McWilliams, Nicolai Meinshausen
Preserving Differential Privacy Between Features in Distributed Estimation
Morten Dahl, Valerio Pastro, Mathieu Poumeyrol
Private Data Aggregation on a Budget   [PDF]
Keith Bonawitz, Vladimir Ivanov, Ben Kreuter, Antonio Marcedone, Brendan McMahan, Sarvar Patel, Daniel Ramage, Aaron Segal, Karn Seth
Practical Secure Aggregation for Federated Learning on User-Held Data   [PDF]
Nicolas Papernot, Ulfar Erlingsson, Martin Abadi, Kunal Talwar, Ian Goodfellow
Machine Learning with Privacy by Knowledge Aggregation and Transfer    [PDF]
Hassan Takabi, Ehsan Hesamifard, Mehdi Ghasemi
Privacy Preserving Multi-party Machine Learning with Homomorphic Encryption    [PDF]
Lu Tian, Bargav Jayaraman, Quanquan Gu, David Evans
Aggregating Private Sparse Learning Models Using Multi-Party Computation    [PDF]
Phillipp Schoppmann, Adrià Gascón, Mariana Raykova, David Evans, Samee Zahur, Jack Doerner, Borja Balle
Secure Distributed Linear Regression
Jakub Konečný, H. Brendan McMahan, Felix Yu, Peter Richtárik, Ananda Theertha Suresh, Dave Bacon
Federated Learning: Strategies for Improving Communication Efficiency    [PDF]

EPSRC Travel Grants

Grants are available to help partially cover the travel expenses of students and researchers affiliated with institutions in the United Kingdom attending the workshop. Each grant will reimburse registration costs and travel expenses up to a maximum of 800 pounds. We might be unable to provide awards to all applicants, in which case awards will be determined by the organizers based on the application material.

Applications are due on November 27, 2016.

An application for a travel grant will consist of a single PDF file with a justification of financial need, a summary of research interests, and a brief discussion of why the applicant would benefit from participating in the workshop. Please send your application to agascon@inf.ed.ac.uk with the subject line "PMPML Travel Grant".

Call For Papers & Important Dates

Abstract submission: September 26, 2016, 11:59pm CET (extended from September 23)
Notification of acceptance: October 3, 2016
NIPS early registration deadline: October 6, 2016
Workshop: December 9, 2016

We invite submissions of recent work on private and multi-party machine learning, both theoretical and application-oriented. As in other NIPS workshops, all accepted abstracts will be presented in a poster session held during the workshop. Additionally, the PC will select a subset of the abstracts for short oral presentations. At least one author of each accepted abstract is expected to attend the workshop and present the work.

Submissions in the form of extended abstracts must be at most 4 pages long (not including references) and adhere to the NIPS format. We do accept submissions of work recently published or currently under review. Submissions do not need to be anonymized. The workshop will not have formal proceedings, but authors of accepted abstracts can choose to have their work published on the workshop webpage.

Organization


Workshop organizers

  • Borja Balle (Lancaster)
  • Aurélien Bellet (INRIA)
  • David Evans (Virginia)
  • Adrià Gascón (Edinburgh)

Program Committee

  • Myrto Arapinis (Edinburgh)
  • Louis Aslett (Oxford)
  • Emiliano De Cristofaro (UCL)
  • Jihun Hamm (Ohio State)
  • Yan Huang (Indiana)
  • Aggelos Kiayias (Edinburgh)
  • Mahnush Movahedi (Yale)
  • Jan Ramon (INRIA)
