Publication:

MultiDK: A Multiple Descriptor Multiple Kernel Approach for Molecular Discovery and Its Application to Organic Flow Battery Electrolytes

Date

2017-04-10

Publisher

American Chemical Society (ACS)

Citation

Sung-jin Kim, Adrián Jinich, Alan Aspuru-Guzik. "MultiDK: A Multiple Descriptor Multiple Kernel Approach for Molecular Discovery and Its Application to Organic Flow Battery Electrolytes." J. Chem. Inf. Model. 57, no. 4 (2017): 657-668. DOI: 10.1021/acs.jcim.6b00332

Abstract

We propose a multiple descriptor multiple kernel (MultiDK) method for efficient molecular discovery using machine learning. We show that the MultiDK method improves both the speed and the accuracy of molecular property prediction. We apply the method to the discovery of electrolyte molecules for aqueous redox flow batteries. Using multiple-type, as opposed to single-type, descriptors, more relevant features for machine learning can be obtained. Following the principle of the 'wisdom of the crowds', the combination of multiple-type descriptors significantly boosts prediction performance. Moreover, MultiDK can exploit irregularities between molecular structure and property relations better than the linear regression method by employing multiple kernels, that is, more than one kernel function for a set of the input descriptors. The multiple kernels consist of the Tanimoto similarity function and a linear kernel for a set of binary descriptors and a set of non-binary descriptors, respectively. Using MultiDK, we achieve an average performance of r² = 0.92 with a set of molecules for solubility prediction. We also extend MultiDK to predict pH-dependent solubility and apply it to solubility estimation of quinone molecules with ionizable functional groups as strong candidates for flow battery electrolytes.
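
The kernel pairing described in the abstract (Tanimoto similarity for binary descriptors, a linear kernel for non-binary descriptors) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not say how the two kernels are combined or which regressor is used, so the weighted sum in `combined_kernel`, the `weight` and `alpha` parameters, and the ridge-style solver are all assumptions made for illustration, and every function name here is hypothetical.

```python
import numpy as np


def tanimoto_kernel(A, B):
    """Tanimoto similarity between rows of binary matrices A (n, d) and B (m, d)."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    common = A @ B.T                       # bits set in both fingerprints
    count_a = A.sum(axis=1)[:, None]       # bits set in each row of A
    count_b = B.sum(axis=1)[None, :]       # bits set in each row of B
    union = count_a + count_b - common     # bits set in either fingerprint
    safe_union = np.where(union > 0, union, 1.0)  # avoid division by zero
    # Assumption: two all-zero fingerprints are treated as identical (similarity 1).
    return np.where(union > 0, common / safe_union, 1.0)


def linear_kernel(A, B):
    """Plain dot-product kernel for non-binary (real-valued) descriptors."""
    return np.asarray(A, dtype=float) @ np.asarray(B, dtype=float).T


def combined_kernel(bin_a, bin_b, num_a, num_b, weight=0.5):
    """Assumed combination rule: a convex sum of the two kernels."""
    return (weight * tanimoto_kernel(bin_a, bin_b)
            + (1.0 - weight) * linear_kernel(num_a, num_b))


def fit_kernel_ridge(K_train, y, alpha=1e-3):
    """Solve (K + alpha * I) c = y for the dual coefficients c (assumed regressor)."""
    n = K_train.shape[0]
    return np.linalg.solve(K_train + alpha * np.eye(n), np.asarray(y, dtype=float))


def predict(K_test_train, dual_coef):
    """Predict from the kernel between test rows and training rows."""
    return K_test_train @ dual_coef
```

To use it, one would build `combined_kernel` on the training molecules' binary fingerprints and real-valued descriptors, fit the dual coefficients against measured solubilities, and then evaluate the same kernel between new molecules and the training set before calling `predict`.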

Keywords

Library and Information Sciences, Computer Science Applications, General Chemical Engineering, General Chemistry

Terms of Use

This article is made available under the terms and conditions applicable to Open Access Policy Articles (OAP), as set forth at Terms of Service
