Show simple item record

dc.contributor.advisor: Barak, Boaz
dc.contributor.advisor: Cox, David
dc.contributor.author: Bansal, Yamini
dc.date.accessioned: 2022-06-07T06:25:16Z
dc.date.created: 2022
dc.date.issued: 2022-05-18
dc.date.submitted: 2022-05
dc.identifier.citation: Bansal, Yamini. 2022. Building the Theoretical Foundations of Deep Learning: An Empirical Approach. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.
dc.identifier.other: 29209453
dc.identifier.uri: https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37372168
dc.description.abstract: While tremendous practical progress has been made in deep learning, we lack a clear theoretical understanding of what makes deep learning work well, and why. In this thesis, we take a "natural sciences" approach towards building a theory for deep learning. We begin by identifying various empirical properties that emerge in practical deep networks across a variety of different settings. Then, we discuss how these empirical findings can be used to inform theory. Specifically, we show the following: (1) In contrast with supervised learning, state-of-the-art deep networks trained with self-supervised learning achieve bounded generalization gap under certain conditions, despite being over-parameterized. (2) Models with similar performance and architecture often converge to similar internal representations, even when their training method differs substantially (e.g., supervised learning vs. self-supervised learning). (3) Interpolating classifiers obey a form of distributional generalization: they converge to a type of conditional sampler from the training distribution. (4) The data scaling properties of deep networks are robust to changes in the architecture and noise levels of the training dataset. Our findings highlight that despite the lack of worst-case guarantees, deep networks implicitly behave in a predictable, structured manner, thus laying the foundations for future theoretical analysis.
dc.format.mimetype: application/pdf
dc.language.iso: en
dash.license: LAA
dc.subject: Deep Learning
dc.subject: Engineering
dc.title: Building the Theoretical Foundations of Deep Learning: An Empirical Approach
dc.type: Thesis or Dissertation
dash.depositing.author: Bansal, Yamini
dc.date.available: 2022-06-07T06:25:16Z
thesis.degree.date: 2022
thesis.degree.grantor: Harvard University Graduate School of Arts and Sciences
thesis.degree.level: Doctoral
thesis.degree.name: Ph.D.
dc.contributor.committeeMember: Kakade, Sham
dc.type.material: text
thesis.degree.department: Engineering and Applied Sciences - Computer Science
dc.identifier.orcid: 0000-0002-1806-9298
dash.author.email: yamini63bansal@gmail.com

