Person: Bun, Mark Mar
Email Address
AA Acceptance Date
Birth Date
Research Projects
Organizational Units
Job Title
Last Name
First Name
Name
Search Results
Publication Fingerprinting codes and the price of approximate differential privacy
(Association of Computing Machinery, 2014) Bun, Mark Mar; Ullman, Jonathan; Vadhan, SalilWe show new lower bounds on the sample complexity of (ε, δ)-differentially private algorithms that accurately answer large sets of counting queries. A counting query on a database D ∈ ({0, 1}d)n has the form "What fraction of the individual records in the database satisfy the property q?" We show that in order to answer an arbitrary set Q of » nd counting queries on D to within error ±α it is necessary that [EQUATION] This bound is optimal up to poly-logarithmic factors, as demonstrated by the Private Multiplicative Weights algorithm (Hardt and Rothblum, FOCS'10). It is also the first to show that the sample complexity required for (ε, δ)-differential privacy is asymptotically larger than what is required merely for accuracy, which is O(log |Q|/α2). In addition, we show that our lower bound holds for the specific case of k-way marginal queries (where |Q| = 2k(d/k)) when α is a constant. Our results rely on the existence of short fingerprinting codes (Boneh and Shaw, CRYPTO'95; Tardos, STOC'03), which we show are closely connected to the sample complexity of differentially private data release. We also give a new method for combining certain types of sample complexity lower bounds into stronger lower bounds.
Publication Separating Computational and Statistical Differential Privacy in the Client-Server Model
(2016) Bun, Mark Mar; Chen, Yi-Hsiu; Vadhan, SalilDifferential privacy is a mathematical definition of privacy for statistical data analysis. It guarantees that any (possibly adversarial) data analyst is unable to learn too much information that is specific to an individual. Mironov et al. (CRYPTO 2009) proposed several computational relaxations of differential privacy (CDP), which relax this guarantee to hold only against computationally bounded adversaries. Their work and subsequent work showed that CDP can yield substantial accuracy improvements in various multiparty privacy problems. However, these works left open whether such improvements are possible in the traditional client-server model of data analysis. In fact, Groce, Katz and Yerukhimovich (TCC 2011) showed that, in this setting, it is impossible to take advantage of CDP for many natural statistical tasks. Our main result shows that, assuming the existence of sub-exponentially secure one-way functions and 2-message witness indistinguishable proofs (zaps) for NP, that there is in fact a computational task in the clientserver model that can be efficiently performed with CDP, but is infeasible to perform with information-theoretic differential privacy.
Publication Bridging the Gap between Computer Science and Legal Approaches to Privacy
(Harvard Law School, 2018) Nissim, Kobbi; Bembenek, Aaron; Wood, Alexandra; Bun, Mark Mar; Gaboardi, Marco; Gasser, Urs; O'Brien, David; Vadhan, Salil; Steinke, ThomasThe analysis and release of statistical data about individuals and groups of individuals carries inherent privacy risks, and these risks have been conceptualized in different ways within the fields of law and computer science. For instance, many information privacy laws adopt notions of privacy risk that are sector- or context-specific, such as in the case of laws that protect from disclosure certain types of information contained within health, educational, or financial records. In addition, many privacy laws refer to specific techniques, such as deidentification, that are designed to address a subset of possible attacks on privacy. In doing so, many legal standards for privacy protection rely on individual organizations to make case-by-case determinations regarding concepts such as the identifiability of the types of information they hold. These regulatory approaches are intended to be flexible, allowing organizations to (1) implement a variety of specific privacy measures that are appropriate given their varying institutional policies and needs, (2) adapt to evolving best practices, and (3) address a range of privacy-related harms. However, in the absence of clear thresholds and detailed guidance on making case-specific determinations, flexibility in the interpretation and application of such standards also creates uncertainty for practitioners and often results in ad hoc, heuristic processes. This uncertainty may pose a barrier to the adoption of new technologies that depend on unambiguous privacy requirements. It can also lead organizations to implement measures that fall short of protecting against the full range of data privacy risks.