Improving Neural Networks with Generalizable Performance Predictors and Generative Code Language Models
Author
Mishra, Aakash
Citation
Mishra, Aakash. 2023. Improving Neural Networks with Generalizable Performance Predictors and Generative Code Language Models. Bachelor's thesis, Harvard College.
Abstract
Neural Architecture Search (NAS) is a growing field with many evolving facets of research, from evaluation strategies and search space criteria to architecture optimization strategies and performance prediction. Currently, these spaces are disjoint and constrained by a lack of generalizability: structured search spaces restrict algorithms to specific architectures, while performance estimators are tied to given benchmarks and cannot conduct zero-shot evaluation. Using advances in generative AI, we present a chimera of the aforementioned methods in a tool called NAS-Assistant. Our methodology consists of a new generalizable GNN-based neural architecture encoder and a clustering, attention-based regression network that predicts model performance with high accuracy and transferability. We also propose a unique method for evaluating the contribution of each layer of a network, combined with zero-cost NAS evaluation. Lastly, we develop a framework for using generative code language models to explore any model search space requested from NAS-Assistant. This thesis aims to demonstrate the first integrated generative AI optimizer for Neural Architecture Search.
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page
https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37376427
Collections
- FAS Theses and Dissertations [6566]
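The abstract describes a pipeline in which a GNN-based encoder maps a neural architecture (a graph of operations) to a fixed-size embedding, which a regression network then maps to a predicted performance score. As an illustrative aside, here is a minimal sketch of that general idea; the operation vocabulary, embedding dimension, averaging message-passing scheme, and the linear head (standing in for the thesis's clustering, attention-based regressor) are all assumptions for illustration, not the thesis's actual method.

```python
import numpy as np

def encode_architecture(adj, ops, rounds=2, dim=8, num_ops=4, seed=0):
    """Toy GNN-style encoder (assumed design, not the thesis's model).

    adj: (n, n) adjacency matrix of the architecture's operation graph.
    ops: (n,) integer operation ids drawn from a small vocabulary.
    Each round averages features over neighbors (plus a self-loop),
    then the node features are mean-pooled into one graph embedding.
    """
    rng = np.random.default_rng(seed)
    embed = rng.normal(size=(num_ops, dim))   # random op-embedding table
    h = embed[ops]                            # (n, dim) initial node features
    a = adj + np.eye(len(ops))                # add self-loops
    a = a / a.sum(axis=1, keepdims=True)      # row-normalize for averaging
    for _ in range(rounds):
        h = np.tanh(a @ h)                    # one round of message passing
    return h.mean(axis=0)                     # (dim,) graph-level embedding

def predict_performance(embedding, w, b):
    """Linear head standing in for the attention-based regression network."""
    return float(embedding @ w + b)

# Toy 3-node cell: input -> conv -> output (hypothetical op ids 0, 1, 2).
adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [0, 0, 0]], dtype=float)
ops = np.array([0, 1, 2])

z = encode_architecture(adj, ops)
rng = np.random.default_rng(1)
w, b = rng.normal(size=z.shape[0]), 0.0
score = predict_performance(z, w, b)
```

In a real predictor, both the embedding table and the head would be trained on architecture-performance pairs from a NAS benchmark; the sketch only shows the encode-then-regress data flow.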