Aneuploidy Prediction and Tumor Classification with Heterogeneous Hidden Conditional Random Fields

dc.contributor.author Schapire, Robert E.
dc.contributor.author Barutcuoglu, Zafer
dc.contributor.author Airoldi, Edoardo
dc.contributor.author Dumeaux, Vanessa
dc.contributor.author Troyanskaya, Olga G.
dc.date.accessioned 2009-03-27T18:49:01Z
dc.date.issued 2008
dc.identifier.citation Barutcuoglu, Zafer, Edoardo M. Airoldi, Vanessa Dumeaux, Robert E. Schapire, and Olga G. Troyanskaya. 2008. Aneuploidy Prediction and Tumor Classification with Heterogeneous Hidden Conditional Random Fields. Bioinformatics Advance Access published on December 4, 2008, DOI 10.1093/bioinformatics/btn585. en
dc.identifier.issn 1367-4803 en
dc.identifier.issn 1460-2059 en
dc.identifier.uri http://nrs.harvard.edu/urn-3:HUL.InstRepos:2757289
dc.description.abstract Motivation: The heterogeneity of cancer cannot always be recognized by tumor morphology, but may be reflected by the underlying genetic aberrations. Array-CGH methods provide high-throughput data on genetic copy numbers, but determining the clinically relevant copy number changes remains a challenge. Conventional classification methods for linking recurrent alterations to clinical outcome ignore sequential correlations in selecting relevant features. Conversely, existing sequence classification methods can only model overall copy number instability, without regard to any particular position in the genome. Results: Here we present the Heterogeneous Hidden Conditional Random Field, a new integrated array-CGH analysis method for jointly classifying tumors, inferring copy numbers, and identifying clinically relevant positions in recurrent alteration regions. By capturing the sequentiality as well as the locality of changes, our integrated model provides better noise reduction, and achieves more relevant gene retrieval and more accurate classification than existing methods. We provide an efficient L1-regularized discriminative training algorithm, which notably selects a small set of candidate genes most likely to be clinically relevant and driving the recurrent amplicons of importance. Our method thus provides unbiased starting points in deciding which genomic regions and which genes in particular to pursue for further examination. Our experiments on synthetic data and real genomic cancer prediction data show that our method is superior, both in prediction accuracy and relevant feature discovery, to existing methods. We also demonstrate that it can be used to generate novel biological hypotheses for breast cancer. en
dc.description.sponsorship Statistics en
dc.language.iso en_US en
dc.publisher Oxford University Press en
dc.relation.isversionof http://dx.doi.org/10.1093/bioinformatics/btn585 en
dash.license OAP
dc.title Aneuploidy Prediction and Tumor Classification with Heterogeneous Hidden Conditional Random Fields en
dc.type Journal Article
dc.description.version Version of Record
dc.relation.journal Bioinformatics en
dash.depositing.author Airoldi, Edoardo

Files in this item

Files Size Format View
Barutcuoglu_Aneuploidy.pdf 561.0Kb PDF View/Open

This item appears in the following Collection(s)

  • FAS Scholarly Articles [6885]
    Peer reviewed scholarly articles from the Faculty of Arts and Sciences of Harvard University
