Publication: Exploring Machine Learning Applications to Enable Next-Generation Chemistry
Date
2019-01-16
Citation
Wei, Jennifer Nansean. 2019. Exploring Machine Learning Applications to Enable Next-Generation Chemistry. Doctoral dissertation, Harvard University, Graduate School of Arts & Sciences.
Abstract
As global demand for energy and materials grows while our dependence on petroleum and fossil fuels declines, it is necessary to revolutionize the way we make new materials. Machine learning provides several avenues for accelerating the discovery pipeline. Models employing machine learning optimization have already begun to accelerate materials discovery by identifying new candidates for organic LEDs and by predicting simple synthetic routes for organic molecules. Furthermore, researchers have used machine learning models to perform complicated tasks that were previously thought to be possible only for humans; such models can be leveraged to propose new molecular candidates.
In my PhD work, I have developed machine learning models for three different challenges in chemistry.
1) I developed molecular autoencoders to encode molecular space, which is on the order of 10^60 in size, as a 200-dimensional vector. In this vector representation, I demonstrate how gradient descent and other optimization techniques can be used to explore the space and find molecules that optimize target properties.
2) I built neural network models for predicting reactions within selected families of molecules, helping us to characterize the reactivity of a molecule.
3) I also developed a model that predicts electron-ionization mass spectra for small molecules in milliseconds, making it possible to expand both the coverage of mass spectral libraries and the range of compounds that can be identified with mass spectrometry.
Together, these models illustrate some of the ways machine learning can be used to propose new molecules and to accelerate their identification. As the field of machine learning develops, there will be many other possible applications to help accelerate the materials discovery pipeline.
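The idea in item 1, optimizing a target property by following gradients in a continuous latent space, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the dissertation: `toy_property` is a hypothetical quadratic stand-in for a learned property predictor, the names `optimize_latent`, `target`, and `z_start` are invented here, and no encoder or decoder is included. Only the latent dimension of 200 comes from the abstract.

```python
import random

LATENT_DIM = 200  # size of the latent vector mentioned in the abstract


def toy_property(z, target):
    """Hypothetical property surrogate: largest (zero) when z equals target."""
    return -sum((zi - ti) ** 2 for zi, ti in zip(z, target))


def toy_property_grad(z, target):
    """Analytic gradient of toy_property with respect to z."""
    return [-2.0 * (zi - ti) for zi, ti in zip(z, target)]


def optimize_latent(z0, target, lr=0.1, steps=100):
    """Plain gradient ascent on the surrogate, starting from latent point z0."""
    z = list(z0)
    for _ in range(steps):
        grad = toy_property_grad(z, target)
        # Step uphill: move each coordinate along its gradient component.
        z = [zi + lr * gi for zi, gi in zip(z, grad)]
    return z


rng = random.Random(0)  # fixed seed so the run is reproducible
target = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]   # assumed optimum
z_start = [rng.gauss(0.0, 1.0) for _ in range(LATENT_DIM)]  # assumed starting point
z_opt = optimize_latent(z_start, target)

print("property before:", toy_property(z_start, target))
print("property after: ", toy_property(z_opt, target))
```

In a real pipeline of this kind, the starting point would come from encoding a known molecule, the gradient would come from a trained property predictor rather than a closed-form expression, and the optimized vector would be passed through a decoder to recover a candidate molecule.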
Keywords
machine learning, cheminformatics
Terms of Use
This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service