Publication:
Exploring Machine Learning Applications to Enable Next-Generation Chemistry

Date

2019-01-16

Citation

Wei, Jennifer Nansean. 2019. Exploring Machine Learning Applications to Enable Next-Generation Chemistry. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.

Abstract

As global demand for energy and materials grows while our dependence on petroleum and fossil fuels declines, it is necessary to revolutionize the way we make new materials. Machine learning provides several avenues for accelerating the discovery pipeline. Models employing machine learning optimization have already begun to accelerate materials discovery by identifying new candidates for organic LEDs and predicting simple synthetic routes for organic molecules. Furthermore, researchers have used machine learning models to perform complicated tasks that were previously thought possible only for humans; such models can be leveraged to propose new molecular candidates. In my PhD work, I have developed machine learning models for three different challenges in chemistry. 1) I developed molecular autoencoders to map molecular space, with a size on the order of 10^60, into a 200-dimensional vector. In this vector representation, I demonstrate how we can use gradient descent and other optimization techniques to explore this space and find molecules that optimize target properties. 2) I built neural network models for predicting reactions within selected families of molecules, helping us to characterize the reactivity of a molecule. 3) I also developed a model that can predict electron-ionization mass spectra for small molecules in milliseconds, making it possible to expand the coverage of mass spectral libraries and the range of compounds that can be identified with mass spectrometry. Together, these models illustrate some of the ways machine learning can be used to propose new molecules and to accelerate their identification. As the field of machine learning develops, there will be many other possible applications to help accelerate the materials discovery platform.

Keywords

machine learning, cheminformatics

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
