Publication: Estimation and Optimization of Information Measures with Applications to Fairness and Differential Privacy
Date
2023-05-17
Authors
Alghamdi, Wael Mohammed A.
Published Version
Citation
Alghamdi, Wael Mohammed A. 2023. Estimation and Optimization of Information Measures with Applications to Fairness and Differential Privacy. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.
Abstract
My dissertation solves three theoretical problems in the optimization and estimation of information measures, and builds on this theory to introduce novel practical algorithms for: 1) optimal mechanism design for differential privacy (DP); 2) optimal group-fairness enhancement in machine learning; and 3) estimation of information measures from data using sample moments. Information measures (in particular, f-divergences) provide a rigorous way to tackle several real-world problems. Examples include: 1) quantifying the degree of privacy afforded by data-releasing mechanisms, using the hockey-stick divergence; 2) correcting trained machine learning (ML) classifiers for group fairness, via optimizing cross-entropy; and 3) detecting new dependencies between pairs of natural phenomena, via estimating mutual information from data. Herein, we put forth mathematically grounded approaches to these three practical problems.
In the first third of the dissertation, we design optimal DP mechanisms in the large-composition regime, and we derive a fast and accurate DP accountant for this regime via the method of steepest descent from mathematical physics. We prove that the privacy parameter is equivalent to a KL-divergence term, and then provide solutions to the ensuing min-max KL-divergence problem.
In the second third of the dissertation, we generalize the ubiquitous concept of information projection to the case of conditional distributions, which we term model projection. We derive explicit formulas for model projection, as well as a parallelizable algorithm that computes it efficiently and at scale. We instantiate our model-projection theory in the domain of group-fair ML, thereby obtaining an optimal multi-class fairness-enhancement method that runs on the order of seconds on datasets with more than 1 million samples.
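To illustrate how the hockey-stick divergence quantifies DP, the sketch below (an illustrative example, not code from the dissertation) evaluates E_gamma(P||Q) with gamma = e^eps for the Gaussian mechanism, P = N(mu, 1) and Q = N(0, 1), both by direct numerical integration of the divergence integral and via the known closed form for this Gaussian pair (the analytic Gaussian mechanism of Balle and Wang); the value of this divergence is exactly the delta in (eps, delta)-DP. All function names here are my own.

```python
import math

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gaussian_delta(eps, mu):
    """Closed-form hockey-stick divergence E_{e^eps}(N(mu,1) || N(0,1)),
    i.e. delta(eps) for a Gaussian mechanism with sensitivity-to-noise
    ratio mu (analytic Gaussian mechanism formula)."""
    return phi(mu / 2 - eps / mu) - math.exp(eps) * phi(-mu / 2 - eps / mu)

def hockey_stick_numeric(eps, mu, lo=-20.0, hi=20.0, n=200_000):
    """Midpoint-rule integration of (p(x) - e^eps * q(x))_+ over the real
    line, where p = N(mu,1) and q = N(0,1) densities."""
    gamma = math.exp(eps)
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        p = math.exp(-0.5 * (x - mu) ** 2) / math.sqrt(2 * math.pi)
        q = math.exp(-0.5 * x ** 2) / math.sqrt(2 * math.pi)
        total += max(p - gamma * q, 0.0) * h
    return total
```

The two evaluations agree to high precision; the closed form is what makes fast accounting possible, since the numerical integral becomes expensive once many compositions are involved.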
In the last third of the dissertation, we derive the functional form of the relationship between information measures and the underlying moments. Plugging the sample moments of the data into our new moments-based formulas, we estimate mutual information and differential entropy efficiently and robustly with respect to affine transformations of the samples.
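A toy version of the moments-based idea (my own minimal example, not the dissertation's estimator): for jointly Gaussian variables, mutual information depends only on second moments, I(X;Y) = -0.5 * log(1 - rho^2) nats, so a plug-in estimate built from sample moments is automatically invariant to affine transformations of each coordinate:

```python
import math
import random

def mi_gaussian_from_moments(xs, ys):
    """Plug-in mutual-information estimate (in nats) for jointly Gaussian
    data, computed purely from sample means, variances, and covariance."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    vx = sum((x - mx) ** 2 for x in xs) / n
    vy = sum((y - my) ** 2 for y in ys) / n
    cxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    rho = cxy / math.sqrt(vx * vy)
    return -0.5 * math.log(1.0 - rho ** 2)

# Draw correlated Gaussian samples with rho = 0.8,
# so the true MI is -0.5 * log(1 - 0.64) ~ 0.511 nats.
random.seed(0)
rho = 0.8
xs, ys = [], []
for _ in range(50_000):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    xs.append(z1)
    ys.append(rho * z1 + math.sqrt(1 - rho ** 2) * z2)

i_hat = mi_gaussian_from_moments(xs, ys)
# Affine invariance: rescaling and shifting each coordinate leaves the
# estimate unchanged (rho^2 is unaffected by per-coordinate affine maps).
i_aff = mi_gaussian_from_moments([3 * x - 7 for x in xs],
                                 [-0.5 * y + 2 for y in ys])
```

The dissertation's contribution is far more general (functional relationships between information measures and moments beyond the Gaussian case), but the affine-robustness property demonstrated here is the same one claimed in the abstract.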
Keywords
differential privacy, estimation, f-divergence, group-fairness, information projection, moments, Applied mathematics
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service