High-Throughput Phenotyping With Electronic Medical Record Data Using a Common Semi-Supervised Approach (PheCAP)
Access Status
Full text of the requested work is not available in DASH at this time ("restricted access"). For more information on restricted deposits, see our FAQ.
Author
Ho, Yuk-Lam
Gainer, Vivian
Link, Nicholas
Honerlaw, Jacqueline
Huang, Sicong
Plenge, Robert
O'Donnell, Christopher
Published Version
https://doi.org/10.1038/s41596-019-0227-6
Citation
Zhang, Yichi, Tianrun Cai, Sheng Yu, Kelly Cho, Chuan Hong, Jiehuan Sun, Jie Huang, Yuk-Lam Ho, et al. 2019. High-throughput Phenotyping with Electronic Medical Record Data Using a Common Semi-supervised Approach (PheCAP). Nature Protocols 14, no. 12: 3426-3444.
Abstract
Phenotypes are the foundation for clinical and genetic studies of disease risk and outcomes. The growth of biobanks linked to electronic medical record (EMR) data has both facilitated and increased the demand for efficient, accurate, and robust approaches for phenotyping millions of patients. Challenges to phenotyping using EMR data include variation in the accuracy of codes, as well as the high level of manual input required to identify features for the algorithm and to obtain gold standard labels. To address these challenges, we developed PheCAP, a high-throughput semi-supervised phenotyping pipeline. PheCAP begins with data from the EMR, including structured data and information extracted from the narrative notes using natural language processing (NLP). The standardized steps integrate automated procedures, which reduce the level of manual input, with machine learning approaches for algorithm training. PheCAP itself can be executed in 1-2 days if all data are available; however, the timing is largely dependent on the chart review step, which typically requires at least 2 weeks. The final products of PheCAP include a phenotype algorithm, the probability of the phenotype for all patients, and a phenotype classification (yes/no).
Citable link to this page
http://nrs.harvard.edu/urn-3:HUL.InstRepos:42083016
Collections
- HMS Scholarly Articles [17714]