Publication:
Bridging Sociotechnical Gaps for Privacy-Preserving Data Science

Date

2023-05-17

Citation

Sarathy, Jayshree. 2023. Bridging Sociotechnical Gaps for Privacy-Preserving Data Science. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.

Abstract

Over the last two decades, a rapid rise in computational power, the availability of data sources, and data-based systems has created new threats to the privacy of data subjects. This dissertation considers differential privacy (DP), a mathematical framework for guaranteeing that an algorithm reveals little about any individual data record, even to an attacker with additional information about the dataset. DP has a rich theoretical literature, but integrating the theory of DP with the practices of data science has raised many challenges. In addition, recent deployments of DP across government and industry have highlighted the complexities of communicating the goals and guarantees of DP to stakeholders. This dissertation combines perspectives from Computer Science and Science & Technology Studies to offer new ways of understanding and addressing the challenges of practical, privacy-preserving data science. In particular, this thesis analyzes the tensions between theory and practice along empirical, mathematical, and sociotechnical dimensions. The contributions of this thesis include (1) empirically investigating the utility of differential privacy for social science researchers, highlighting several social and conceptual barriers to adoption, (2) designing and analyzing differentially private algorithms for fundamental yet under-studied statistical tasks, such as simple linear regression, and (3) exploring the entanglements between differential privacy as a mathematical formalization and its socio-political impacts in real-world settings. The dissertation concludes with suggestions for further technical and critical inquiry into the impacts of differential privacy in practice.

Keywords

data science, differential privacy, linear regression, sociotechnical systems, Computer science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
