Publication:

Learning over Molecules: Representations and Kernels

Date

2014-08-08

The Harvard community has made this article openly available.

Citation

Sun, Hong Yang. 2014. Learning over Molecules: Representations and Kernels. Bachelor's thesis, Harvard College.

Abstract

In this paper, we tackle machine learning over molecular space by considering three representations for molecules: (1) a vector of molecular properties that we treat as predictor variables, (2) a graph that captures the relationship between individual atoms in a molecule, and (3) a cheminformatic fingerprint that “identifies” a molecule. We assess the viability of each representation by training a model to predict energy values. In particular, we look at a class of models that use kernel methods, whereby the prediction algorithm relies on a similarity measure between training examples. On a subset of the Harvard Clean Energy Project (CEP) database, we find a simple fingerprint similarity kernel to be the fastest and most accurate for predicting HOMO-LUMO energy gap values.
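
The abstract does not say which fingerprint similarity or which kernel model the thesis uses. As a rough illustration only, the sketch below assumes a Tanimoto similarity over fingerprints stored as sets of on-bit indices, combined with kernel ridge regression. The function names, the regularization value `lam`, and the toy fingerprints and target values are all hypothetical; none of them come from the thesis or from the CEP database.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of on-bit indices."""
    a, b = set(a), set(b)
    union = len(a | b)
    # Two empty fingerprints are treated as identical (an assumption).
    return len(a & b) / union if union else 1.0


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [A | b] without mutating the inputs.
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit(train_fps, y, lam=1e-6):
    """Kernel ridge regression: solve (K + lam * I) alpha = y."""
    n = len(train_fps)
    K = [
        [tanimoto(train_fps[i], train_fps[j]) + (lam if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return solve(K, list(y))


def predict(train_fps, alpha, fp):
    """Predict as a similarity-weighted sum over the training molecules."""
    return sum(a * tanimoto(fp, t) for a, t in zip(alpha, train_fps))


# Toy data, invented for illustration: three fingerprints and target values.
train_fps = [{1, 2, 3}, {2, 3, 4}, {7, 8, 9}]
gaps = [2.0, 2.2, 3.1]

alpha = fit(train_fps, gaps)
estimate = predict(train_fps, alpha, {1, 2, 4})
```

The prediction for a new molecule depends only on its similarity to the training molecules, which is the property of kernel methods the abstract describes. A set-based similarity of this kind is cheap to evaluate compared with comparing molecular graphs directly, though the sketch makes no claim about the timings reported in the thesis.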

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
