Improving Neural Networks with Generalizable Performance Predictors and Generative Code Language Models
Author
Mishra, Aakash
Citation
Mishra, Aakash. 2023. Improving Neural Networks with Generalizable Performance Predictors and Generative Code Language Models. Bachelor's thesis, Harvard College.
Abstract
Neural Architecture Search (NAS) is a growing field with many evolving facets of research, from evaluation strategies and search space criteria to architecture optimization strategies and performance prediction. Currently, these spaces are disjoint and constrained by a lack of generalizability: structured search spaces restrict algorithms to specific architectures, while performance estimators are tied to given benchmarks and cannot conduct zero-shot evaluation. Using advances in generative AI, we present a chimera of the aforementioned methods in a tool called NAS-Assistant. Our methodology consists of a new generalizable GNN-based neural architecture encoder and a clustering, attention-based regression network that predicts model performance with high accuracy and transferability. We also propose a unique method for evaluating the contribution of each layer of a network, combined with zero-cost NAS evaluation. Lastly, we develop a framework for using generative code language models to explore any model search space requested from NAS-Assistant. This thesis aims to demonstrate the first integrated generative AI optimizer for Neural Architecture Search.
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page
https://nrs.harvard.edu/URN-3:HUL.INSTREPOS:37376427
Collections
- FAS Theses and Dissertations [6566]
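The abstract describes a pipeline in which a GNN-based encoder maps a neural architecture (a graph of operations) to a fixed-size embedding, which a regression network then maps to a predicted performance score. As an illustrative aside, here is a minimal sketch of that general idea; the operation vocabulary, embedding dimension, averaging message-passing scheme, and the linear head (standing in for the thesis's clustering, attention-based regressor) are all assumptions for illustration, not the thesis's actual method.

```python
import numpy as np

def encode_architecture(adj, ops, rounds=2, dim=8, num_ops=4, seed=0):
    """Toy GNN-style encoder (assumed design, not the thesis's model).

    adj: (n, n) adjacency matrix of the architecture's operation graph.
    ops: (n,) integer operation ids drawn from a small vocabulary.
    Each round averages features over neighbors (plus a self-loop),
    then the node features are mean-pooled into one graph embedding.
    """
    rng = np.random.default_rng(seed)
    embed = rng.normal(size=(num_ops, dim))   # random op-embedding table
    h = embed[ops]                            # (n, dim) initial node features
    a = adj + np.eye(len(ops))                # add self-loops
    a = a / a.sum(axis=1, keepdims=True)      # row-normalize for averaging
    for _ in range(rounds):
        h = np.tanh(a @ h)                    # one round of message passing
    return h.mean(axis=0)                     # (dim,) graph-level embedding

def predict_performance(embedding, w, b):
    """Linear head standing in for the attention-based regression network."""
    return float(embedding @ w + b)

# Toy 3-node cell: input -> conv -> output (hypothetical op ids 0, 1, 2).
adj = np.array([[0, 1, 0],
                [0, 0, 1],
                [0, 0, 0]], dtype=float)
ops = np.array([0, 1, 2])

z = encode_architecture(adj, ops)
rng = np.random.default_rng(1)
w, b = rng.normal(size=z.shape[0]), 0.0
score = predict_performance(z, w, b)
```

In a real predictor, both the embedding table and the head would be trained on architecture-performance pairs from a NAS benchmark; the sketch only shows the encode-then-regress data flow.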