The proliferation of online sensitive data about individuals and organizations makes concern about the privacy of these data a top priority. There have been many formulations of privacy and, unfortunately, many negative results about the feasibility of maintaining privacy of sensitive data in realistic networked environments. We formulate communication-complexity-based definitions, both worst case and average case, of a problem’s privacy-approximation ratio. We use our definitions to investigate the extent to which approximate privacy is achievable in a number of standard problems: the 2nd-price Vickrey auction, Yao’s millionaires problem, the public-good problem, and the set-theoretic disjointness and intersection problems. For both the 2nd-price Vickrey auction and the millionaires problem, we show that not only is perfect privacy impossible or infeasibly costly to achieve, but even close approximations of perfect privacy suffer from the same lower bounds. By contrast, if the inputs are drawn uniformly at random from {0, ..., 2^k - 1}, then, for both problems, simple and natural communication protocols have privacy-approximation ratios that are linear in k (i.e., logarithmic in the size of the input space). We also demonstrate tradeoffs between privacy and communication in a family of auction protocols. We show that the privacy-approximation ratio provided by any protocol for the disjointness and intersection problems is necessarily exponential (in k). We also use these ratios to argue that one protocol for each of these problems is significantly fairer than the others we consider (in the sense of relative effects on the privacy of the different players).
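As a concrete illustration, here is a minimal Python sketch of the natural bitwise protocol for the millionaires problem on inputs drawn from {0, ..., 2^k - 1} (a simplified rendering for intuition, not necessarily the exact protocol analyzed above): the parties announce their bits from the most significant position down and stop at the first disagreement, so the transcript, and hence what an observer learns, depends on where the inputs first differ.

def millionaires_bitwise(x: int, y: int, k: int):
    """Compare two k-bit inputs by exchanging bits from the most
    significant position down, stopping at the first disagreement.
    Returns (richer_party, bits_communicated); the transcript is all
    an eavesdropper sees, which is what approximate privacy measures."""
    for i in reversed(range(k)):           # most significant bit first
        xb = (x >> i) & 1                  # Alice announces her i-th bit
        yb = (y >> i) & 1                  # Bob announces his i-th bit
        if xb != yb:
            return ("Alice" if xb > yb else "Bob", 2 * (k - i))
    return ("tie", 2 * k)                  # inputs are equal

print(millionaires_bitwise(0b1011, 0b1001, 4))   # ('Alice', 6)

Inputs that differ in a high-order bit leak only a short prefix, while near-equal inputs force almost the entire 2k-bit transcript; this dependence on the inputs is what the average-case privacy-approximation ratio captures.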
We investigate the importance of space when solving problems based on graph distance in the streaming model. In this model, the input graph is presented as a stream of edges in an arbitrary order. The main computational restriction of the model is that we have limited space and therefore cannot store all the streamed data; we are forced to make space-efficient summaries of the data as we go along. For a graph of n vertices and m edges, we show that testing many graph properties, including connectivity (ergo any reasonable decision problem about distances) and bipartiteness, requires Ω(n) bits of space. Given this, we then investigate how the power of the model increases as we relax our space restriction. Our main result is an efficient randomized algorithm that constructs a (2t + 1)-spanner in one pass. With high probability, it uses O(t · n^{1+1/t} · log^2 n) bits of space and processes each edge in the stream in O(t^2 · n^{1/t} · log n) time. We find approximations to diameter and girth via the constructed spanner. For t = Ω(log n / log log n), the space requirement of the algorithm is O(n · polylog n), and the per-edge processing time is O(polylog n). We also show a corresponding lower bound of t for the approximation ratio achievable when the space restriction is O(t · n^{1+1/t} · log^2 n). We then consider the scenario in which we are allowed multiple passes over the input stream. Here, we investigate whether allowing these extra passes will compensate for a given space restriction. We show that finding vertices at distance d from a particular vertex will always take d passes, for all d ∈ {1, ..., t/2}, when the space restriction is o(n^{1+1/t}). For girth, we show the existence of a direct trade-off between space and passes in the form of a lower bound on the product of the space requirement and the number of passes. Finally, we conclude with two general techniques for speeding up the per-edge computation time of streaming algorithms while increasing the space by at most a log factor.
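For intuition about the model (our illustrative sketch, not the spanner construction above), connectivity can be decided in one pass with a union-find structure over the n vertices, using O(n log n) bits and never storing the edges themselves, essentially matching the Ω(n) lower bound:

def streaming_connected(n: int, edge_stream) -> bool:
    """One-pass connectivity: keep only a union-find over the n
    vertices (O(n log n) bits), never the streamed edges."""
    parent = list(range(n))

    def find(v: int) -> int:
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    components = n
    for u, v in edge_stream:               # each edge seen exactly once
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            components -= 1
    return components == 1

edges = iter([(0, 1), (2, 3), (1, 2)])
print(streaming_connected(4, edges))       # True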
Although machine learning (ML) is widely used for predictive tasks, there are important scenarios in which ML cannot be used or at least cannot achieve its full potential. A major barrier to adoption is the sensitive nature of predictive queries. Individual users may lack sufficiently rich datasets to train accurate models locally but also be unwilling to send sensitive queries to commercial services that vend such models. One central goal of privacy-preserving machine learning (PPML) is to enable users to submit encrypted queries to a remote ML service, receive encrypted results, and decrypt them locally. We aim to develop practical solutions for real-world privacy-preserving ML inference problems. In this paper, we propose a privacy-preserving XGBoost prediction algorithm, which we have implemented and evaluated empirically on AWS SageMaker. Experimental results indicate that our algorithm is efficient enough to be used in real ML production environments.
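The intended query flow is sketched below. The names Encryptor and evaluate_encrypted are hypothetical placeholders, and the additive "encryption" is a toy stand-in rather than the scheme actually used; the sketch shows only the shape of the protocol: encrypt locally, evaluate remotely on ciphertexts, decrypt locally.

class Encryptor:
    """Placeholder for a client-held encryption scheme (toy stand-in)."""
    def __init__(self, key: float):
        self.key = key

    def encrypt(self, features):
        return [f + self.key for f in features]   # NOT real crypto

    def decrypt(self, value):
        return value - self.key                   # NOT real crypto

def evaluate_encrypted(model, enc_features):
    """Runs server-side on ciphertexts only; never sees the key."""
    return model(enc_features)

enc = Encryptor(key=42.0)
query = enc.encrypt([3.1, 0.7])                   # client side
enc_result = evaluate_encrypted(lambda xs: xs[0], query)  # server side
print(enc.decrypt(enc_result))                    # client side: 3.1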
We address two questions about self-reducibility: the power of adaptiveness in examiners that take advice and the relationship between random-self-reducibility and self-correctability. We first show that adaptive examiners are more powerful than nonadaptive examiners, even if the nonadaptive ones are nonuniform. Blum et al. (1993) showed that every random-self-reducible function is self-correctable. However, whether self-correctability implies random-self-reducibility is unknown. We show that, under a reasonable complexity hypothesis, there exists a self-correctable function that is not random-self-reducible. For P-sampleable distributions, however, we show that constructing a self-correctable function that is not random-self-reducible is as hard as proving that P ≠ PP.
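To ground the terminology: a self-corrector turns a program that computes a function correctly on most inputs into one that is correct on every input, and random-self-reducibility is the standard route to such correctors. The classic example, due to Blum, Luby, and Rubinfeld, is a linear function over Z_p, sketched here in Python:

import random
from collections import Counter

P = 101  # a small prime modulus

def self_correct_linear(faulty_f, x: int, trials: int = 31) -> int:
    """Self-corrector for a linear function f over Z_p: since
    f(x) = f(x + r) - f(r) for every r, querying the faulty program
    at two uniformly random points and taking a majority vote
    recovers f(x) whenever the program errs on a small fraction of
    inputs."""
    votes = Counter()
    for _ in range(trials):
        r = random.randrange(P)
        votes[(faulty_f((x + r) % P) - faulty_f(r)) % P] += 1
    return votes.most_common(1)[0][0]

def faulty(v):                        # f(v) = 7v mod P, wrong on v < 5
    return 0 if v < 5 else (7 * v) % P

print(self_correct_linear(faulty, 3), (7 * 3) % P)   # both print 21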
We initiate an investigation of probabilistically checkable debate systems (PCDS), a natural generalization of probabilistically checkable proof systems (PCPS). A PCDS for a language L consists of a probabilistic polynomial-time verifier V and a debate between player 1, who claims that the input x is in L, and player 0, who claims that the input x is not in L. We show that there is a PCDS for L in which V flips O(log n) random coins and reads O(1) bits of the debate if and only if L is in PSPACE. This characterization of PSPACE is used to show that certain PSPACE-hard functions are as hard to approximate closely as they are to compute exactly. These results first appeared in our Technical Memorandum [CFLS93a].
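To make the resource bounds concrete, here is a toy Python sketch (our illustration, not the construction in the paper) of a verifier that spends O(log n) random coins to sample a constant number of positions of the debate string and reads only those bits:

import random

def spot_check_verifier(debate: str, local_check, queries: int = 3) -> bool:
    """Toy verifier: chooses `queries` random positions (each choice
    costs about log2(n) coins, so O(log n) coins in total for constant
    `queries`), reads only those O(1) bits of the debate, and accepts
    iff every sampled bit passes the local check."""
    n = len(debate)
    positions = [random.randrange(n) for _ in range(queries)]
    return all(local_check(i, debate[i]) for i in positions)

# Local check for a toy "language": every bit of the debate must be '1'.
print(spot_check_verifier("1" * 64, lambda i, b: b == "1"))   # True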