Publication:

Differentially Private Ridge Regression: The Cost of a Hyperparameter

Date

2021-06-04

Citation

Piazza, Tyler. 2021. Differentially Private Ridge Regression: The Cost of a Hyperparameter. Bachelor's thesis, Harvard College.

Abstract

Studying problems of interest, such as finding trends in medical data, can require analyzing data that contain sensitive and personally identifying information. As a result, it is often infeasible to release these datasets to researchers or to the general public. In this thesis, we study differentially private algorithms, which come with theoretical guarantees that they reveal only limited information about any individual while still providing insights about large groups. We discuss various forms of linear and ridge regression in this differentially private setting, with the goal of learning from sensitive data to make predictions about future sensitive data. In particular, we examine the internal privacy-loss budgeting of the differentially private ridge regression technique adaSSP. This thesis makes three contributions. First, we discuss the existing SSP and adaSSP algorithms and provide detailed proofs that each is differentially private. Second, we introduce two new algorithms, adaSSPbudget and constSSPfull, and prove that each is differentially private. Third, we conduct experiments on synthetic and real-world data to explore whether tuning the privacy-loss budgeting used within these algorithms could improve their performance. These experiments explore the tradeoff between the accuracy of a hyperparameter and the accuracy of the other releases. We find that performance is often insensitive to the particular privacy-loss budgeting and that, for certain datasets, no choice of privacy-loss budget allows the adaptive adaSSPbudget to outperform the standard SSP algorithm.
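
For readers unfamiliar with the idea of budgeting privacy loss across several noisy releases, the following is a minimal sketch of a generic sufficient-statistics-perturbation approach to ridge regression. It is not taken from the thesis and should not be read as its SSP, adaSSP, adaSSPbudget, or constSSPfull algorithms. The function names, the row and label bounds `bound_x` and `bound_y`, the split parameter `frac_xtx`, the equal division of delta, and the use of the classical Gaussian mechanism (which assumes 0 < eps < 1 for each release) are all assumptions made for illustration.

```python
import numpy as np


def gaussian_sigma(sensitivity, eps, delta):
    """Noise scale of the classical Gaussian mechanism (assumes 0 < eps < 1)."""
    return sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps


def ssp_ridge(X, y, eps, delta, lam, bound_x=1.0, bound_y=1.0,
              frac_xtx=0.5, rng=None):
    """Illustrative private ridge regression via noisy sufficient statistics.

    frac_xtx is a hypothetical knob: the share of eps spent on X^T X,
    with the remainder spent on X^T y. lam is treated as a fixed,
    data-independent hyperparameter here.
    """
    rng = np.random.default_rng() if rng is None else rng
    d = X.shape[1]

    # Clip rows and labels so that the sensitivities below actually hold.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X * np.minimum(1.0, bound_x / np.maximum(norms, 1e-12))
    y = np.clip(y, -bound_y, bound_y)

    # Split the budget between the two releases (basic composition).
    eps_xtx, eps_xty = frac_xtx * eps, (1.0 - frac_xtx) * eps
    delta_each = delta / 2.0

    # Release X^T X with symmetric Gaussian noise.
    # Adding or removing one row changes it by at most bound_x**2 in Frobenius norm.
    sigma_xtx = gaussian_sigma(bound_x ** 2, eps_xtx, delta_each)
    upper = np.triu(rng.normal(0.0, sigma_xtx, size=(d, d)))
    noise = upper + np.triu(upper, 1).T
    noisy_xtx = X.T @ X + noise

    # Release X^T y with Gaussian noise; sensitivity is bound_x * bound_y.
    sigma_xty = gaussian_sigma(bound_x * bound_y, eps_xty, delta_each)
    noisy_xty = X.T @ y + rng.normal(0.0, sigma_xty, size=d)

    # Post-processing: solve the regularized normal equations.
    return np.linalg.solve(noisy_xtx + lam * np.eye(d), noisy_xty)
```

In this sketch, moving `frac_xtx` toward 0 or 1 shifts noise from one release to the other, which is the kind of budgeting choice the abstract refers to. The adaptive algorithms named in the abstract additionally involve the accuracy of a hyperparameter release; that step is not shown here.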

Keywords

Differential Privacy, Hyperparameter, Linear Regression, Machine Learning, Privacy Budget, Ridge Regression, Computer Science

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
