Publication: Statistical Mechanics of Generalization in Kernel Regression and Wide Neural Networks
Date
2022-09-02
Citation
Canatar, Abdulkadir. 2022. Statistical Mechanics of Generalization in Kernel Regression and Wide Neural Networks. Doctoral dissertation, Harvard University Graduate School of Arts and Sciences.
Abstract
A theoretical understanding of generalization remains an open problem for many machine learning models, including deep neural networks. Here, we study this problem for kernel regression, which, besides being a popular machine learning method, also describes wide neural networks. We develop an analytical theory of generalization in kernel regression using the replica theory of statistical mechanics; the theory is applicable to any kernel and data distribution. Experiments with practical kernels, including those arising from wide neural networks, show perfect agreement with our theory. We provide an in-depth analysis of our theory for kernel generalization. We show that kernel machines employ an inductive bias towards simple functions, which prevents them from overfitting the data. We characterize whether a kernel is compatible with a learning task in terms of sample efficiency. We identify a first-order phase transition in our theory, where more data may impair generalization when the task is noisy or not expressible by the kernel. We extend these results to out-of-distribution generalization and quantum kernel machines. We study representation learning in Bayesian neural networks using perturbation theory, and show that the features of wide neural networks receive corrections from the target labels.
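As a rough illustration of the quantity the abstract refers to, the following minimal sketch estimates the generalization error of kernel ridge regression empirically for a few training-set sizes. It is not the replica-theory calculation developed in the dissertation. The Gaussian (RBF) kernel, the sine target, the length scale, the ridge value, and all function names are assumptions chosen only for this example.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=0.5):
    # Gaussian (RBF) kernel between two sets of 1-D inputs.
    # The kernel choice and length scale are assumptions for illustration.
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-d2 / (2.0 * length_scale ** 2))


def kernel_ridge_predict(x_train, y_train, x_test, ridge=1e-3):
    # Solve (K + ridge * I) alpha = y, then predict with k(x_test, x_train) @ alpha.
    k_train = rbf_kernel(x_train, x_train)
    alpha = np.linalg.solve(k_train + ridge * np.eye(len(x_train)), y_train)
    return rbf_kernel(x_test, x_train) @ alpha


def target(x):
    # Hypothetical target function, not taken from the dissertation.
    return np.sin(2.0 * np.pi * x)


def generalization_error(n_train, noise_std=0.0, n_test=500, seed=0):
    # Mean squared error against the noiseless target on fresh test inputs.
    rng = np.random.default_rng(seed)
    x_train = rng.uniform(-1.0, 1.0, size=n_train)
    y_train = target(x_train) + noise_std * rng.normal(size=n_train)
    x_test = rng.uniform(-1.0, 1.0, size=n_test)
    pred = kernel_ridge_predict(x_train, y_train, x_test)
    return float(np.mean((pred - target(x_test)) ** 2))


if __name__ == "__main__":
    # Print the estimated error for increasing training-set sizes.
    for n in (5, 20, 80):
        print(n, generalization_error(n))
```

Raising noise_std in generalization_error gives a simple way to explore how label noise changes the error as the sample size varies, which is the regime the abstract highlights.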
Keywords
Generalization Error, Kernel Regression, Replica Theory, Artificial intelligence, Statistical physics
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service