
dc.contributor.author: Bontempi, Gianluca
dc.contributor.author: Haibe-Kains, Benjamin
dc.contributor.author: Desmedt, Christine
dc.contributor.author: Sotiriou, Christos
dc.contributor.author: Quackenbush, John
dc.date.accessioned: 2013-04-03T19:11:00Z
dc.date.issued: 2011
dc.identifier.citation: Bontempi, Gianluca, Benjamin Haibe-Kains, Christine Desmedt, Christos Sotiriou, and John Quackenbush. 2011. Multiple-input multiple-output causal strategies for gene selection. BMC Bioinformatics 12:458. [en_US]
dc.identifier.issn: 1471-2105 [en_US]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:10496304
dc.description.abstract: Background: Traditional strategies for selecting variables in high-dimensional classification problems aim to find sets of maximally relevant variables able to explain the target variations. Although these techniques may be effective in terms of generalization accuracy, they often do not reveal direct causes, essentially because high correlation (or relevance) does not imply causation. In this study, we show how to efficiently incorporate causal information into gene selection by moving from a single-input single-output to a multiple-input multiple-output setting. Results: We show in a synthetic case study that a better prioritization of causal variables can be obtained by considering a relevance score that incorporates a causal term. In addition, we show, in a meta-analysis of six publicly available breast cancer microarray datasets, that the improvement also occurs in terms of accuracy. The biological interpretation of the results confirms the potential of a causal approach to gene selection. Conclusions: Integrating causal information into gene selection algorithms is effective in terms of both prediction accuracy and biological interpretation. [en_US]
dc.language.iso: en_US [en_US]
dc.publisher: BioMed Central [en_US]
dc.relation.isversionof: doi:10.1186/1471-2105-12-458 [en_US]
dc.relation.hasversion: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3323860/pdf/ [en_US]
dash.license: LAA
dc.subject: algorithms [en_US]
dc.subject: Bayes theorem [en_US]
dc.subject: breast neoplasms [en_US]
dc.subject: genetics [en_US]
dc.subject: female [en_US]
dc.subject: gene expression profiling [en_US]
dc.subject: humans [en_US]
dc.subject: oligonucleotide array sequence analysis [en_US]
dc.subject: software [en_US]
dc.title: Multiple-Input Multiple-Output Causal Strategies for Gene Selection [en_US]
dc.type: Journal Article [en_US]
dc.description.version: Version of Record [en_US]
dc.relation.journal: BMC Bioinformatics [en_US]
dash.depositing.author: Quackenbush, John
dc.date.available: 2013-04-03T19:11:00Z
dc.identifier.doi: 10.1186/1471-2105-12-458 [*]
dash.contributor.affiliated: Quackenbush, John
dc.identifier.orcid: 0000-0002-2702-5879

