dc.contributor.advisor | Barak, Boaz | |
dc.contributor.advisor | Cox, David | |
dc.contributor.author | Bansal, Yamini | |
dc.date.accessioned | 2022-06-07T06:25:16Z | |
dc.date.created | 2022 | |
dc.date.issued | 2022-05-18 | |
dc.date.submitted | 2022-05 | |
dc.identifier.citation | Bansal, Yamini. 2022. Building the Theoretical Foundations of Deep Learning: An Empirical Approach. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences. | |
dc.identifier.other | 29209453 | |
dc.identifier.uri | https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37372168 | * |
dc.description.abstract | While tremendous practical progress has been made in deep learning, we lack a clear theoretical understanding of what makes deep learning work well, and why. In this thesis, we take a "natural sciences" approach towards building a theory for deep learning. We begin by identifying various empirical properties that emerge in practical deep networks across a variety of different settings. Then, we discuss how these empirical findings can be used to inform theory. Specifically, we show the following: (1) In contrast with supervised learning, state-of-the-art deep networks trained with self-supervised learning achieve a bounded generalization gap under certain conditions, despite being over-parameterized. (2) Models with similar performance and architecture often converge to similar internal representations, even when their training methods differ substantially (e.g., supervised learning vs. self-supervised learning). (3) Interpolating classifiers obey a form of distributional generalization: they converge to a type of conditional sampler from the training distribution. (4) The data scaling properties of deep networks are robust to changes in the architecture and noise levels of the training dataset.
Our findings highlight that, despite the lack of worst-case guarantees, deep networks implicitly behave in a predictable, structured manner, thus laying the foundations for future theoretical analysis. | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | |
dash.license | LAA | |
dc.subject | Deep Learning | |
dc.subject | Engineering | |
dc.title | Building the Theoretical Foundations of Deep Learning: An Empirical Approach | |
dc.type | Thesis or Dissertation | |
dash.depositing.author | Bansal, Yamini | |
dc.date.available | 2022-06-07T06:25:16Z | |
thesis.degree.date | 2022 | |
thesis.degree.grantor | Harvard University Graduate School of Arts and Sciences | |
thesis.degree.level | Doctoral | |
thesis.degree.name | Ph.D. | |
dc.contributor.committeeMember | Kakade, Sham | |
dc.type.material | text | |
thesis.degree.department | Engineering and Applied Sciences - Computer Science | |
dc.identifier.orcid | 0000-0002-1806-9298 | |
dash.author.email | yamini63bansal@gmail.com | |