Publication: Analysis of the Harvard Computer Society Email Archives: An Exploration of Differential Privacy in Practice
Abstract
This thesis provides a rudimentary introduction to differential privacy as a framework for modern data privacy, using the Harvard Computer Society email list archives as a case study. The differentially private analysis of this dataset includes, but is not limited to, time series of list usage, email topic modeling, and sentiment analysis. OpenDP’s Python package for differential privacy is used extensively to execute computations, and its API is evaluated as a programming framework in its own right. Novel differentially private graph algorithms are implemented and empirically assessed. Lastly, this thesis discusses a significant, inherent challenge: balancing the competing demands of differential privacy and exploratory data analysis.
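
To give a concrete sense of what a differentially private time series of list usage involves, the following is a minimal sketch of the textbook Laplace mechanism applied to per-period email counts. It is not taken from the thesis and does not use the OpenDP API; the function names (`laplace_noise`, `dp_period_counts`) and the example counts are hypothetical. It assumes the privacy unit is a single email, so that adding or removing one email changes exactly one count by 1 (L1 sensitivity 1). Protecting a whole person, who may send many emails, would require bounding contributions and a larger noise scale.

```python
import random


def laplace_noise(scale, rng):
    """Draw one sample from Laplace(0, scale).

    The difference of two independent exponentials with mean `scale`
    follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_period_counts(counts, epsilon, rng=None):
    """Return noisy per-period counts using the Laplace mechanism.

    Assumption (not from the thesis): each email falls into exactly one
    period, so the vector of counts has L1 sensitivity 1 and noise of
    scale 1/epsilon per period gives epsilon-DP at the email level.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    scale = 1.0 / epsilon
    return [c + laplace_noise(scale, rng) for c in counts]


if __name__ == "__main__":
    # Hypothetical monthly email counts, for illustration only.
    monthly = [120, 95, 143, 88]
    print(dp_period_counts(monthly, epsilon=0.5))
```

Smaller values of `epsilon` give stronger privacy but noisier counts, which is one face of the tension with exploratory analysis that the abstract mentions. Naive floating-point sampling like this is known to be unsafe for real releases; a vetted library such as OpenDP is the appropriate tool in practice.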