Publication:
Estimation and Optimization of Information Measures with Applications to Fairness and Differential Privacy

Date

2023-05-17

Published Version


The Harvard community has made this article openly available.


Citation

Alghamdi, Wael Mohammed A. 2023. Estimation and Optimization of Information Measures with Applications to Fairness and Differential Privacy. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

My dissertation solves three theoretical problems on optimizing and estimating information measures, and builds on this theory to introduce novel practical algorithms for: 1) optimal mechanism design for differential privacy (DP); 2) optimal group-fairness enhancement in machine learning; and 3) estimation of information measures from data using sample moments. Information measures (in particular, f-divergences) provide a rigorous way to tackle several real-world problems. Examples include: 1) quantifying the degree of privacy afforded by data-releasing mechanisms, using the hockey-stick divergence; 2) correcting trained machine learning (ML) classifiers for group fairness, via optimizing cross-entropy; and 3) detecting new dependencies between pairs of natural phenomena, via estimating mutual information from data. Herein, we put forth mathematically grounded approaches for the above three practical problems.

In the first third of the dissertation, we design optimal DP mechanisms in the large-composition regime, and we derive a fast and accurate DP accountant for this regime via the method of steepest descent from mathematical physics. We prove that the privacy parameter is equivalent to a KL-divergence term, and then provide solutions to the ensuing minimax KL-divergence problem.

In the second third of the dissertation, we generalize the ubiquitous concept of information projection to the case of conditional distributions, which we term model projection. We derive explicit formulas for the model projection, as well as a parallelizable algorithm that computes it efficiently and at scale. We instantiate our model projection theory in the domain of group-fair ML, thereby obtaining an optimal multi-class fairness enhancement method that runs on the order of seconds on datasets of more than 1 million samples.
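For intuition on the first application, the hockey-stick divergence mentioned above has a simple closed form for discrete distributions. The following is a minimal illustrative sketch (not the dissertation's code): a mechanism is (eps, delta)-DP exactly when the hockey-stick divergence at gamma = e^eps, between the output distributions on any two neighboring datasets, is at most delta.

```python
import numpy as np

def hockey_stick(p, q, gamma):
    """E_gamma(P || Q) = sum_x max(P(x) - gamma * Q(x), 0).

    (eps, delta)-DP holds iff E_{e^eps}(P || Q) <= delta for the output
    distributions P, Q induced by every pair of neighboring datasets.
    """
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.maximum(p - gamma * q, 0.0).sum()

# Example: output distributions of a mechanism on two neighboring datasets.
p = [0.6, 0.4]
q = [0.5, 0.5]
delta = hockey_stick(p, q, np.exp(0.1))  # smallest delta for eps = 0.1
```

At gamma = 1 the hockey-stick divergence reduces to total variation distance, which is why it interpolates naturally between DP regimes.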
In the last third of the dissertation, we derive the functional form of the relationship between information measures and the underlying moments. By plugging the sample moments of the data into our new moment-based formulas, we estimate mutual information and differential entropy efficiently and robustly against affine transformations of the samples.
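As a concrete special case of a moment-based estimator (an illustration of the idea, not the dissertation's general formulas): for jointly Gaussian pairs, mutual information depends only on the correlation coefficient, which is a function of first and second sample moments and is invariant under per-coordinate affine transformations.

```python
import numpy as np

def gaussian_mi_from_moments(x, y):
    """Estimate I(X; Y) = -0.5 * ln(1 - rho^2) in nats, valid when
    (X, Y) are jointly Gaussian. rho is the sample correlation, computed
    from first and second sample moments only, so the estimate is
    unchanged by affine maps x -> a*x + b, y -> c*y + d (a, c != 0)."""
    rho = np.corrcoef(x, y)[0, 1]
    return -0.5 * np.log1p(-rho ** 2)

rng = np.random.default_rng(0)
x = rng.normal(size=100_000)
y = 0.8 * x + 0.6 * rng.normal(size=100_000)  # corr(X, Y) = 0.8
mi = gaussian_mi_from_moments(x, y)           # near -0.5 * ln(1 - 0.64)
```

Because only moments enter the formula, the same estimate is produced after rescaling or shifting the samples, which illustrates the affine robustness claimed above.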

Keywords

differential privacy, estimation, f-divergence, group-fairness, information projection, moments, Applied mathematics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth in the Terms of Service.
