NIPS 2016 Workshop
Barcelona, December 9
Centre de Convencions Internacional Barcelona
Room 131 + 132
News: Slides from invited talks are now available!
The one-day workshop focuses on the problem of privacy-preserving machine learning in scenarios where sensitive datasets are distributed across multiple data owners. Such distributed scenarios occur quite often in practice, for example when different parties contribute different records to a dataset, or when information about each record in the dataset is held by different data owners.
Different communities have developed approaches to deal with this problem, including differential privacy-like techniques where noisy sketches are exchanged between the parties, homomorphic encryption where operations are performed on encrypted data, and tailored approaches using techniques from the field of secure multiparty computation. The workshop will serve as a forum to unify different perspectives on this problem, explore the relative merits of each approach, and investigate future lines of research.
The workshop will place a particular emphasis on the decentralization aspect of privacy-preserving machine learning. This includes a large number of realistic scenarios where the classical setup of differential privacy, with a 'trusted curator' that prepares the data, cannot be directly applied. The problem of privacy-preserving computation gains relevance in this model, and effectively leveraging the tools developed by the cryptographic community to build private multiparty learning algorithms poses a remarkable challenge.
08.45  Welcome and Introduction 
9.00  Mariana Raykova — Secure Computation: Why, How and When [slides] 
The goal of secure computation is to facilitate the evaluation of functionalities that depend on the private inputs of several distrusting parties in a privacy-preserving manner which minimizes the information revealed about the inputs. In this talk we will introduce example problems motivating the work in the area of secure computation, including problems related to machine learning. We will discuss how we formalize the notion of privacy in cryptographic protocols and how we prove privacy-preserving properties for secure computation constructions. We will provide an overview of some main techniques and constructions for secure computation, including Yao's garbled circuits, approaches based on secret sharing, and others. Lastly, we will cover the different efficiency measures relevant for the practical use of secure computation protocols.
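The secret-sharing approach mentioned above can be illustrated with a minimal additive-sharing sketch (all names and the modulus choice are illustrative, not from any talk): each party splits its input into random shares that sum to the input, parties locally add the shares they hold, and only the recombined partial sums reveal the total.

```python
import secrets

P = 2**61 - 1  # public modulus; an illustrative choice

def share(x, n):
    """Split secret x into n additive shares that sum to x mod P."""
    shares = [secrets.randbelow(P) for _ in range(n - 1)]
    shares.append((x - sum(shares)) % P)
    return shares

def reconstruct(shares):
    return sum(shares) % P

# Three parties jointly compute the sum of their private inputs:
inputs = [12, 7, 30]
all_shares = [share(x, 3) for x in inputs]
# Party i locally adds the i-th share of every input...
partial_sums = [sum(s[i] for s in all_shares) % P for i in range(3)]
# ...and only the combined partial sums reveal the total, not the inputs.
assert reconstruct(partial_sums) == sum(inputs) % P
```

Any n-1 of the shares are uniformly random, so no proper subset of parties learns anything about an individual input.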


09.45  Stratis Ioannidis — Secure Function Evaluation at Scale [slides] 
Secure Function Evaluation (SFE) allows an interested party to evaluate a function over private data without learning anything about the inputs other than the outcome of this computation. This offers a strong privacy guarantee: SFE enables, e.g., a medical researcher, a statistician, or a data analyst, to conduct a study over private, sensitive data, without jeopardizing the privacy of the study's participants (patients, online users, etc.). Nevertheless, applying SFE to “big data” poses several challenges. First, beyond any computational overheads due to encryptions and decryptions, executing an algorithm securely may lead to a polynomial blowup in the total work (i.e., number of computation steps) compared to execution in the clear. For large datasets, even going from linear to quadratic time is prohibitive. Second, secure evaluations of algorithms should also maintain parallelizability: an algorithm that is easy to parallelize in the clear should also maintain this property in its SFE version, if its execution is to scale. Addressing this is challenging as communication patterns between processors often reveal a lot about the private inputs. In this talk, we show that several machine learning and data mining algorithms can be executed securely while leading to only (a) a logarithmic blowup in total work and (b) a logarithmic increase in the execution time when executed in parallel. Our techniques generalize to any algorithm that is graph-parallel, i.e., can be expressed through a sequence of scatter, gather, and apply operations. This includes several algorithms of great practical interest, including PageRank, matrix factorization, and training neural networks, to name a few.
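The graph-parallel (gather/apply/scatter) pattern the abstract refers to can be sketched with PageRank in the clear; this is purely illustrative of the programming model, not of the secure evaluation itself, and it assumes every vertex has at least one out-edge.

```python
def pagerank_gas(edges, n, d=0.85, iters=20):
    """PageRank over n vertices via gather/apply steps.
    edges is a list of (src, dst) pairs; every vertex is assumed
    to have at least one outgoing edge."""
    out_deg = [0] * n
    for src, _ in edges:
        out_deg[src] += 1
    rank = [1.0 / n] * n
    for _ in range(iters):
        # Gather: each vertex sums contributions along its in-edges.
        acc = [0.0] * n
        for src, dst in edges:
            acc[dst] += rank[src] / out_deg[src]
        # Apply: each vertex updates its state from the gathered value.
        rank = [(1 - d) / n + d * a for a in acc]
        # Scatter would push the new rank along out-edges; here the
        # next gather simply reads `rank` directly.
    return rank
```

Because the computation is phrased entirely as these uniform per-edge and per-vertex steps, its communication pattern is data-independent, which is what makes a scalable secure version plausible.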


10.30  Coffee Break 
11.00  Jack Doerner — An Introduction to Practical MPC [slides] 
The field of Secure Multiparty Computation (MPC) has seen an explosion of research over the past few years: though once a mostly theoretical idea, it has rapidly become a powerful, practical tool capable of efficiently solving realworld problems. However, this has come at the cost of dramatically increased complexity, expressed through a diversity of foundational techniques, highlevel systems, and implementation folk knowledge. This talk will address the practical aspects of multiparty computation, discussing a number of low level paradigms and their accompanying implementations, along with the various efficiency, functionality, and usability compromises that they offer. In addition, it will serve as an introduction to a set of tools and techniques that are commonly used in conjunction with generic MPC schemes, such as Oblivious RAM, permutation networks, and custom protocols. It is hoped that this will serve as a jumpingoffpoint, from which new problems can be addressed.


11.30  AnonML: Anonymous Machine Learning Over a Network of Data Holders (contributed talk) 
Centralized data warehouses can be disadvantageous to users for many reasons, including privacy, security, and control. We propose AnonML, a system for anonymous, peer-to-peer machine learning. At a high level, AnonML functions by moving as much computation as possible to its end users, away from a central authority. AnonML users store and compute features on their own data, thereby limiting the amount of information they need to share. To generate a model, a group of data-holding peers first agree on a model definition, a set of feature functions, and an aggregator, a peer who temporarily acts as a central authority. Each peer anonymously sends several small packets of labeled feature data to the aggregator. In exchange, the aggregator generates a classifier and shares it with the group. In this way, AnonML data holders control what information they share on a feature-by-feature and model-by-model basis, and peers are able to disassociate features from their digital identities. Additionally, each peer can generate models suited to their particular needs, and the whole network benefits from the creation of novel, useful models.
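The peer/aggregator flow can be sketched as follows; the function names, the toy mean-difference "classifier", and the data layout are all hypothetical illustrations of the division of labor, not the AnonML implementation.

```python
def peer_features(record, feature_fns):
    # Runs on the peer's device; the raw record never leaves it.
    # Only the agreed-upon derived features are (anonymously) sent.
    return [f(record) for f in feature_fns]

def aggregator_train(batches):
    # Toy linear model: per-feature class-mean difference (illustrative
    # only; a real aggregator would fit a proper classifier).
    pos = [b for b, y in batches if y == 1]
    neg = [b for b, y in batches if y == 0]
    mean = lambda rows, j: sum(r[j] for r in rows) / len(rows)
    k = len(batches[0][0])
    return [mean(pos, j) - mean(neg, j) for j in range(k)]

def predict(weights, feats):
    return 1 if sum(w * x for w, x in zip(weights, feats)) > 0 else 0

# Peers agree on feature functions, compute locally, send labeled
# feature vectors; the aggregator returns the shared model.
feature_fns = [lambda r: float(r["a"]), lambda r: float(r["b"])]
peers = [({"a": 1.0, "b": 0.0}, 1), ({"a": 0.0, "b": 1.0}, 0)]
batches = [(peer_features(rec, feature_fns), y) for rec, y in peers]
model = aggregator_train(batches)
```

The point of the split is that the aggregator only ever observes feature vectors a peer chose to release, never the underlying records or the peer's identity.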


11.50  Private Topic Modeling (contributed talk) 
We develop a privatised stochastic variational inference method for Latent Dirichlet Allocation (LDA). The iterative nature of stochastic variational inference presents challenges: multiple iterations are required to obtain accurate posterior distributions, yet each iteration increases the amount of noise that must be added to achieve a reasonable degree of privacy. We propose a practical algorithm that overcomes this challenge by combining: (1) a relaxed notion of differential privacy, called concentrated differential privacy, which provides high-probability bounds for cumulative privacy loss rather than focusing on single-query loss, and is thus well suited for iterative algorithms; and (2) privacy amplification resulting from subsampling of large-scale data. Focusing on conjugate exponential family models, in our private variational inference all the posterior distributions are privatised by simply perturbing the expected sufficient statistics. Using Wikipedia data, we illustrate the effectiveness of our algorithm for large-scale data.
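The core mechanism, perturbing expected sufficient statistics and relying on minibatch subsampling for amplification, can be sketched as below; the function names, the subsampling rate, and the noise scale are illustrative, and the paper's actual calibration under concentrated DP differs in detail.

```python
import random

def privatise(suff_stats, sigma):
    """Release each expected sufficient statistic with zero-mean
    Gaussian noise added. In the paper's setting, sigma is calibrated
    from the statistics' sensitivity and the per-iteration budget
    under the concentrated-DP accounting."""
    return [s + random.gauss(0.0, sigma) for s in suff_stats]

# One SVI-style iteration: subsample a minibatch (privacy amplification
# comes from touching only a fraction q of the corpus each step),
# compute expected sufficient statistics on it, and release only
# their noisy versions for the posterior update.
corpus = list(range(1000))
q = 0.05
minibatch = random.sample(corpus, int(q * len(corpus)))
stats = [float(len(minibatch)), float(sum(minibatch))]
noisy_stats = privatise(stats, sigma=2.0)
```

Since all downstream posterior updates depend on the data only through these noisy statistics, the whole iterative inference inherits the privacy guarantee by post-processing.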


12.15  Poster Spotlights 
13.00  Lunch Break 
14.30  Practical Secure Aggregation for Federated Learning on User-Held Data (contributed talk) 
Secure Aggregation is a class of Secure Multi-Party Computation algorithms wherein a group of mutually distrustful parties u ∈ U each hold a private value x_u and collaborate to compute an aggregate value, such as the sum ∑_{u∈U} x_u, without revealing to one another any information about their private value except what is learnable from the aggregate value itself. In this work, we consider training a deep neural network in the Federated Learning model, using distributed gradient descent across user-held training data on mobile devices, using Secure Aggregation to protect the privacy of each user's model gradient. We identify a combination of efficiency and robustness requirements which, to the best of our knowledge, are unmet by existing algorithms in the literature. We proceed to design a novel, communication-efficient Secure Aggregation protocol for high-dimensional data that tolerates up to 1/3 of users failing to complete the protocol. For 16-bit input values, our protocol offers 1.73× communication expansion for 2^10 users and 2^20-dimensional vectors, and 1.98× expansion for 2^14 users and 2^24-dimensional vectors.
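The basic idea behind such protocols, pairwise masks that cancel in the sum, can be shown with a minimal one-round sketch over scalars (the paper's protocol works over high-dimensional vectors and adds secret-shared seeds to tolerate dropped users; none of that is modeled here).

```python
import random

M = 2**16  # 16-bit inputs, all arithmetic modulo M

def masked_inputs(xs, rng):
    """Each pair (u, v) agrees on a random mask that u adds and v
    subtracts, so every mask cancels in the sum while each individual
    masked value looks uniformly random."""
    n = len(xs)
    masked = list(xs)
    for u in range(n):
        for v in range(u + 1, n):
            r = rng.randrange(M)  # shared pairwise mask s_{u,v}
            masked[u] = (masked[u] + r) % M
            masked[v] = (masked[v] - r) % M
    return masked

xs = [13, 5, 200, 77]
rng = random.Random(0)
ys = masked_inputs(xs, rng)
# The server sees only masked values, yet recovers the exact sum:
assert sum(ys) % M == sum(xs) % M
```

The fragility of this naive version, one user dropping out leaves uncancelled masks in the sum, is exactly the robustness problem the contributed protocol addresses.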


15.00  Coffee Break 
15.30  Poster Session 
16.30  Richard Nock — Confidential Computing: Federated Private Data Analysis [slides] 
TBD


17.15  Dawn Song — Lessons and Open Challenges in Secure and PrivacyPreserving Machine Learning and Data Analytics 
TBD


18.00  Wrap Up 
Grants are available to help partially cover the travel expenses of students and researchers affiliated with institutions in the United Kingdom attending the workshop. Each grant will reimburse registration costs and travel expenses up to a maximum of 800 pounds. We might be unable to provide awards to all applicants, in which case awards will be determined by the organizers based on the application material.
Applications are due on November 27, 2016.
An application for a travel award will consist of a single PDF file with a justification of financial needs, a summary of research interests, and a brief discussion of why the applicant will benefit from participating in the workshop. Please send your applications to agascon@inf.ed.ac.uk with the subject title "PMPML Travel Grant".
Abstract submission: September 26, 2016, 11:59pm CET (extended from September 23)
Notification of acceptance: October 3, 2016
NIPS early registration deadline: October 6, 2016
Workshop: December 9, 2016
We invite submissions of recent work on private and multiparty machine learning, both theory and application-oriented. As in other NIPS workshops, all accepted abstracts will be part of a poster session held during the workshop. Additionally, the PC will select a subset of the abstracts for short oral presentations. At least one author of each accepted abstract is expected to represent it at the workshop.
Submissions in the form of extended abstracts must be at most 4 pages long (not including references) and adhere to the NIPS format. We do accept submissions of work recently published or currently under review. Submissions do not need to be anonymized. The workshop will not have formal proceedings, but authors of accepted abstracts can choose to have their work published on the workshop webpage.