Publication:

Unsupervised Phenotyping of Severe Asthma Research Program Participants Using Expanded Lung Data

Date

2014-03-03

Publisher

Elsevier BV

Citation

Wu, Wei, Eugene Bleecker, Wendy Moore, William W. Busse, Mario Castro, Kian Fan Chung, Serpil Erzurum et al. "Unsupervised Phenotyping of Severe Asthma Research Program Participants Using Expanded Lung Data." Journal of Allergy and Clinical Immunology 133, no. 5 (2014): 1280-1288. DOI: 10.1016/j.jaci.2013.11.042

Abstract

Background

Previous studies have identified asthma phenotypes based on small numbers of clinical, physiologic or inflammatory characteristics. However, no studies have utilized a wide range of variables using machine learning approaches.

Objectives

To identify subphenotypes of asthma utilizing blood, bronchoscopic, exhaled nitric oxide and clinical data from the Severe Asthma Research Program using unsupervised clustering, and then characterize them using supervised learning approaches.

Methods

Unsupervised clustering approaches were applied to 112 clinical, physiologic and inflammatory variables from 378 subjects. Variable selection and supervised learning techniques were then used to select relevant and nonredundant variables and to assess their predictive value, as well as the predictive value of the full variable set.
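
The two-stage workflow described above (cluster subjects without labels, then check how well a classifier recovers the cluster assignments from all variables versus a reduced subset) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the abstract does not name the algorithms used, so plain k-means and a nearest-centroid classifier are swapped in, and the data, the function names and the chosen column subset are all hypothetical.

```python
# Illustrative sketch only: k-means and a nearest-centroid classifier stand in
# for the clustering and supervised methods, which the abstract does not name.
from math import dist


def nearest(point, centroids):
    """Index of the centroid closest to point (Euclidean distance)."""
    return min(range(len(centroids)), key=lambda j: dist(point, centroids[j]))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def kmeans(points, k, iters=100):
    """Lloyd's algorithm with evenly spaced initial centroids (a simplification)."""
    centroids = [points[i * len(points) // k] for i in range(k)]
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            updated.append(mean_point(members) if members else centroids[j])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels


def select(points, columns):
    """Keep only the chosen variable columns of each subject."""
    return [tuple(p[c] for c in columns) for p in points]


def accuracy(train, train_labels, test, test_labels, columns):
    """Fit per-cluster centroids on train, then score predictions on test."""
    tr, te = select(train, columns), select(test, columns)
    classes = sorted(set(train_labels))
    centroids = [mean_point([p for p, lab in zip(tr, train_labels) if lab == c])
                 for c in classes]
    hits = sum(classes[nearest(p, centroids)] == lab
               for p, lab in zip(te, test_labels))
    return hits / len(te)


# Hypothetical toy data: 3 variables per subject, two obvious groups.
group_a = [(0.1 * i, 1.0 + 0.1 * i, 5.0) for i in range(10)]
group_b = [(10.0 + 0.1 * i, 9.0 + 0.1 * i, 5.0) for i in range(10)]
subjects = group_a + group_b

# Stage 1: unsupervised clustering assigns each subject to a cluster.
_, labels = kmeans(subjects, k=2)

# Stage 2: hold out every fifth subject as a test case and compare the
# full variable set against a reduced (hypothetical) subset.
test_idx = set(range(0, len(subjects), 5))
train = [s for i, s in enumerate(subjects) if i not in test_idx]
train_y = [lab for i, lab in enumerate(labels) if i not in test_idx]
test = [s for i, s in enumerate(subjects) if i in test_idx]
test_y = [lab for i, lab in enumerate(labels) if i in test_idx]

acc_full = accuracy(train, train_y, test, test_y, columns=[0, 1, 2])
acc_reduced = accuracy(train, train_y, test, test_y, columns=[0])
print(acc_full, acc_reduced)
```

On real data the reduced set would come from a variable selection step rather than a hand-picked column, and the two accuracies would generally differ, which is the comparison the Results report for 51 versus 112 variables.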

Results

Ten variable clusters and six subject clusters were identified, which both differed from and overlapped with previously reported clusters. Traditionally defined severe asthmatics were distributed across subject Clusters 3–6. Cluster 4 identified early onset allergic asthmatics with low lung function and eosinophilic inflammation. Cluster 5 was characterized by later onset, mostly severe asthmatics with nasal polyps and eosinophilia. Cluster 6 asthmatics manifested persistent inflammation in blood and bronchoalveolar lavage, as well as exacerbations, despite high systemic corticosteroid use and its side effects. Age of asthma onset, quality of life, symptoms, medications and health care utilization were among the 51 nonredundant variables distinguishing subject clusters. These 51 variables classified test cases with 88% accuracy, compared to 93% accuracy with all 112 variables.

Conclusion

The unsupervised machine learning approaches used here provide unique insights into disease, confirming other approaches while revealing novel additional phenotypes.

Keywords

Immunology, Immunology and Allergy

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service
