In silico phenotyping via co-training for improved phenotype prediction from genotype

Title: In silico phenotyping via co-training for improved phenotype prediction from genotype
Author: Roqueiro, Damian; Witteveen, Menno J.; Anttila, Verneri; Terwindt, Gisela M.; van den Maagdenberg, Arn M.J.M.; Borgwardt, Karsten

Note: Order does not necessarily reflect citation order of authors.

Citation: Roqueiro, Damian, Menno J. Witteveen, Verneri Anttila, Gisela M. Terwindt, Arn M.J.M. van den Maagdenberg, and Karsten Borgwardt. 2015. “In silico phenotyping via co-training for improved phenotype prediction from genotype.” Bioinformatics 31 (12): i303-i310. doi:10.1093/bioinformatics/btv254. http://dx.doi.org/10.1093/bioinformatics/btv254.
Abstract:
Motivation: Predicting disease phenotypes from genotypes is a key challenge in medical applications in the postgenomic era. Large training datasets of patients that have been both genotyped and phenotyped are the key requisite when aiming for high prediction accuracy. With current genotyping projects producing genetic data for hundreds of thousands of patients, large-scale phenotyping has become the bottleneck in disease phenotype prediction.
Results: Here we present an approach for imputing missing disease phenotypes given the genotype of a patient. Our approach is based on co-training, which predicts the phenotype of unlabeled patients based on a second class of information, e.g. clinical health record information. Augmenting training datasets by this type of in silico phenotyping can lead to significant improvements in prediction accuracy. We demonstrate this on a dataset of patients with two diagnostic types of migraine, termed migraine with aura and migraine without aura, from the International Headache Genetics Consortium.
Conclusions: Imputing missing disease phenotypes for patients via co-training leads to larger training datasets and improved prediction accuracy in phenotype prediction.
Availability and implementation: The code can be obtained at: http://www.bsse.ethz.ch/mlcb/research/bioinformatics-and-computational-biology/co-training.html
Contact: karsten.borgwardt@bsse.ethz.ch or menno.witteveen@bsse.ethz.ch
Supplementary information: Supplementary data are available at Bioinformatics online.
Published Version: doi:10.1093/bioinformatics/btv254
Other Sources: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4765855/pdf/
Terms of Use: This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:26318519
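
Illustrative sketch (not the authors' code): the abstract describes imputing phenotypes for unlabeled patients from a second class of information and then using the enlarged dataset to train a genotype-based predictor. The Python below is a minimal, one-directional sketch of that idea under stated assumptions. The nearest-centroid classifier, the margin-based confidence threshold, the function names, the labels "MA"/"MO" and all toy numbers are hypothetical choices made for illustration; the authors' actual implementation is available at the URL given in the abstract and may differ substantially.

```python
from math import dist  # Euclidean distance, Python 3.8+


def fit_centroids(X, y):
    """Return {label: centroid} for feature rows X and labels y."""
    groups = {}
    for row, label in zip(X, y):
        groups.setdefault(label, []).append(row)
    return {
        label: [sum(col) / len(rows) for col in zip(*rows)]
        for label, rows in groups.items()
    }


def predict_with_margin(centroids, row):
    """Return (nearest label, margin).

    The margin is the gap between the distances to the two nearest
    centroids; it serves as a crude confidence score (an assumption of
    this sketch, not something stated in the abstract).
    """
    ranked = sorted((dist(row, c), label) for label, c in centroids.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def in_silico_phenotype(clin_lab, y_lab, clin_unlab, min_margin):
    """Impute phenotypes for unlabeled patients from the clinical view.

    Returns {index into clin_unlab: imputed label}, keeping only
    predictions whose margin is at least min_margin.
    """
    centroids = fit_centroids(clin_lab, y_lab)
    imputed = {}
    for i, row in enumerate(clin_unlab):
        label, margin = predict_with_margin(centroids, row)
        if margin >= min_margin:
            imputed[i] = label
    return imputed


def augment_genotype_training(geno_lab, y_lab, geno_unlab, imputed):
    """Append confidently imputed patients to the genotype training set."""
    keep = sorted(imputed)
    X = list(geno_lab) + [geno_unlab[i] for i in keep]
    y = list(y_lab) + [imputed[i] for i in keep]
    return X, y


# Toy data (entirely made up). Each patient has a clinical view and a
# genotype view; only the first four patients have a known phenotype.
clin_lab = [[0.0, 0.1], [0.2, 0.0], [1.0, 0.9], [0.9, 1.1]]
geno_lab = [[0, 1, 2], [0, 1, 1], [2, 0, 0], [2, 1, 0]]
y_lab = ["MA", "MA", "MO", "MO"]

clin_unlab = [[0.1, 0.1], [0.95, 1.0], [0.5, 0.5]]
geno_unlab = [[0, 2, 2], [2, 0, 1], [1, 1, 1]]

# Step 1: impute phenotypes from the clinical view, keeping confident ones.
imputed = in_silico_phenotype(clin_lab, y_lab, clin_unlab, min_margin=0.5)
print(imputed)  # {0: 'MA', 1: 'MO'}; the ambiguous third patient is skipped

# Step 2: train the genotype-based predictor on the enlarged dataset.
X_aug, y_aug = augment_genotype_training(geno_lab, y_lab, geno_unlab, imputed)
geno_model = fit_centroids(X_aug, y_aug)
```

In this sketch the threshold `min_margin` trades dataset size against label noise: a lower value adds more imputed patients to the genotype training set but admits less certain phenotypes.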