dc.contributor.advisor | Barak, Boaz | |
dc.contributor.advisor | Cox, David | |
dc.contributor.author | Bansal, Yamini | |
dc.date.accessioned | 2022-06-07T06:25:16Z | |
dc.date.created | 2022 | |
dc.date.issued | 2022-05-18 | |
dc.date.submitted | 2022-05 | |
dc.identifier.citation | Bansal, Yamini. 2022. Building the Theoretical Foundations of Deep Learning: An Empirical Approach. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences. | |
dc.identifier.other | 29209453 | |
dc.identifier.uri | https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37372168 | * |
dc.description.abstract | While tremendous practical progress has been made in deep learning, we lack a clear theoretical understanding of what makes deep learning work well, and why. In this thesis, we take a "natural sciences" approach towards building a theory for deep learning. We begin by identifying various empirical properties that emerge in practical deep networks across a variety of different settings. Then, we discuss how these empirical findings can be used to inform theory. Specifically, we show the following: (1) In contrast with supervised learning, state-of-the-art deep networks trained with self-supervised learning achieve a bounded generalization gap under certain conditions, despite being over-parameterized. (2) Models with similar performance and architecture often converge to similar internal representations, even when their training methods differ substantially (e.g., supervised learning vs. self-supervised learning). (3) Interpolating classifiers obey a form of distributional generalization: they converge to a type of conditional sampler from the training distribution. (4) The data scaling properties of deep networks are robust to changes in the architecture and noise levels of the training dataset.
Our findings highlight that, despite the lack of worst-case guarantees, deep networks implicitly behave in a predictable, structured manner, thus laying the foundations for future theoretical analysis. | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | |
dash.license | LAA | |
dc.subject | Deep Learning | |
dc.subject | Engineering | |
dc.title | Building the Theoretical Foundations of Deep Learning: An Empirical Approach | |
dc.type | Thesis or Dissertation | |
dash.depositing.author | Bansal, Yamini | |
dc.date.available | 2022-06-07T06:25:16Z | |
thesis.degree.date | 2022 | |
thesis.degree.grantor | Harvard University Graduate School of Arts and Sciences | |
thesis.degree.level | Doctoral | |
thesis.degree.name | Ph.D. | |
dc.contributor.committeeMember | Kakade, Sham | |
dc.type.material | text | |
thesis.degree.department | Engineering and Applied Sciences - Computer Science | |
dc.identifier.orcid | 0000-0002-1806-9298 | |
dash.author.email | yamini63bansal@gmail.com | |