Publication:
Finite-Precision Arithmetic Isn't Real: The Impact of Finite Data Types on Efforts to Fulfill Differential Privacy on Computers

Date

2022-05-25


Citation

Wagaman, Connor James. 2022. Finite-Precision Arithmetic Isn't Real: The Impact of Finite Data Types on Efforts to Fulfill Differential Privacy on Computers. Bachelor's thesis, Harvard College.

Abstract

Differential privacy (DP) has a strong theoretical foundation, but the path to implementing DP functions on computers is fraught with challenges. A common approach for creating a DP version of some function is to make a two-part algorithm: the algorithm first computes the true result of the function on a dataset, and the algorithm then draws noise from some probability distribution -- with the scale of this distribution based on the behavior of the function which computes the true result -- to perturb this true result and return an answer that both preserves the privacy of individuals in the dataset and provides utility for data analysts. Previous research has investigated the effects of finite-precision arithmetic on the "noise-generation step" of a DP algorithm. These efforts have revealed that finite-precision analogs of noise functions can behave far differently from their continuous counterparts, and that these differences can produce exploitable privacy violations. However, there has been little investigation into the effect of finite-precision arithmetic on the "true answer step" of a DP algorithm. In this paper, we describe how the true answer step can also behave far differently than expected, and how these differences in behavior can result in exploitable privacy violations. We demonstrate how a malicious data analyst can take advantage of these differences in behavior and use the outputs of supposedly DP functions to perform membership-inference attacks and reconstruction attacks on datasets. (These attacks affect all of the libraries of general-purpose DP functions that we investigated, including Google's DP library, IBM's diffprivlib, and Microsoft's SmartNoise.) We also describe how the functions in these libraries can be modified to offer privacy, and we prove the privacy of these modified functions.
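The two-part construction described in the abstract can be sketched as follows. This is a minimal illustration, not code from any of the libraries named above: `laplace_mechanism` is a hypothetical helper assuming the standard Laplace mechanism with scale sensitivity/epsilon. The final lines show the kind of finite-precision surprise in the "true answer step" that the thesis analyzes: on 64-bit floats, adding one record to a dataset can leave its sum unchanged.

```python
import random

def laplace_mechanism(true_answer, sensitivity, epsilon):
    """Idealized two-step DP mechanism (a sketch, not a hardened
    implementation): step 1 has already produced true_answer; step 2
    adds Laplace noise with scale sensitivity / epsilon."""
    scale = sensitivity / epsilon
    # Laplace(0, scale) sampled as the difference of two
    # independent Exponential(1/scale) draws.
    noise = random.expovariate(1 / scale) - random.expovariate(1 / scale)
    return true_answer + noise

# The "true answer step" is itself inexact on 64-bit floats:
# a neighboring dataset differing in a single record can produce
# the *same* sum, because 2**53 + 1 rounds back to 2**53.
base = [2.0 ** 53]
neighbor = base + [1.0]            # differs in one record
assert sum(base) == sum(neighbor)  # true answers collide
```

The assertion passes because float64 has 53 bits of significand, so real-valued sensitivity reasoning about `sum` does not transfer directly to its floating-point implementation.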

Keywords

differential privacy, finite data types, floating point, reconstruction attacks, summation, Computer science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
