Publication: Bridging Sociotechnical Gaps for Privacy-Preserving Data Science
Date
2023-05-17
Citation
Sarathy, Jayshree. 2023. Bridging Sociotechnical Gaps for Privacy-Preserving Data Science. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.
Abstract
Over the last two decades, rapid growth in computational power, available data sources, and data-based systems has created new threats to the privacy of data subjects. This dissertation considers differential privacy (DP), a mathematical framework for guaranteeing that an algorithm reveals little about any individual data record, even to an attacker with additional information about the dataset. DP has a rich theoretical literature, but integrating the theory of DP with the practices of data science has posed many challenges. In addition, recent deployments of DP across government and industry have highlighted the complexities of communicating the goals and guarantees of DP to stakeholders.
This dissertation combines perspectives from Computer Science and Science & Technology Studies to offer new ways of understanding and addressing the challenges around practical, privacy-preserving data science.
In particular, this thesis analyzes the tensions between theory and practice along empirical, mathematical, and sociotechnical dimensions. Its contributions include (1) empirically investigating the utility of differential privacy for social science researchers, highlighting several social and conceptual barriers to adoption, (2) designing and analyzing differentially private algorithms for fundamental yet under-studied statistical tasks, such as simple linear regression, and (3) exploring the entanglements between differential privacy as a mathematical formalization and its socio-political impacts in real-world settings. The dissertation concludes with suggestions for further technical and critical inquiry into the impacts of differential privacy in practice.
Keywords
data science, differential privacy, linear regression, sociotechnical systems, Computer science
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service