Learning over Molecules: Representations and Kernels
Citation
Sun, Hong Yang. 2014. Learning over Molecules: Representations and Kernels. Bachelor's thesis, Harvard College.
Abstract
In this paper, we tackle machine learning over molecular space by considering three representations for molecules: (1) a vector of molecular properties that we treat as predictor variables, (2) a graph that captures the relationships between individual atoms in a molecule, and (3) a cheminformatic fingerprint that “identifies” a molecule. We assess the viability of each representation by training a model to predict energy values. In particular, we look at a class of models that use kernel methods, whereby the prediction algorithm relies on a similarity measure between training data. On a subset of the Harvard Clean Energy Project (CEP) database, we find a simple fingerprint similarity kernel to be the fastest and most accurate for predicting HOMO-LUMO energy gap values.
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page
http://nrs.harvard.edu/urn-3:HUL.InstRepos:12705171
Collections
- FAS Theses and Dissertations [6136]