
dc.contributor.author: Huang, Yen-Tsung [en_US]
dc.contributor.author: Lin, Xihong [en_US]
dc.date.accessioned: 2014-03-10T16:16:36Z
dc.date.issued: 2013 [en_US]
dc.identifier.citation: Huang, Yen-Tsung, and Xihong Lin. 2013. “Gene set analysis using variance component tests.” BMC Bioinformatics 14 (1): 210. doi:10.1186/1471-2105-14-210. http://dx.doi.org/10.1186/1471-2105-14-210. [en]
dc.identifier.issn: 1471-2105 [en]
dc.identifier.uri: http://nrs.harvard.edu/urn-3:HUL.InstRepos:11877026
dc.description.abstract: Background: Gene set analyses have become increasingly important in genomic research, as many complex diseases arise jointly from alterations of numerous genes. Genes often act together as a functional repertoire, e.g., a biological pathway/network, and are highly correlated. However, most existing gene set analysis methods do not fully account for the correlation among the genes. Here we propose to exploit this important feature of a gene set to improve statistical power in gene set analyses. Results: We propose to model the effects of an independent variable, e.g., exposure/biological status (yes/no), on multiple gene expression values in a gene set using a multivariate linear regression model, where the correlation among the genes is explicitly modeled using a working covariance matrix. We develop TEGS (Test for the Effect of a Gene Set), a variance component test for the gene set effects that assumes a common distribution for the regression coefficients in multivariate linear regression models, and calculate the p-values using permutation and a scaled chi-square approximation. We show using simulations that the type I error is protected under different choices of working covariance matrices and that power improves as the working covariance approaches the true covariance. The global test is a special case of TEGS in which correlation among genes in a gene set is ignored. Using both simulated data and a published diabetes dataset, we show that our test outperforms the commonly used approaches, the global test and gene set enrichment analysis (GSEA). Conclusion: We develop a gene set analysis method (TEGS) under the multivariate regression framework, which directly models the interdependence of the expression values in a gene set using a working covariance. TEGS outperforms two widely used methods, GSEA and the global test, in both simulations and diabetes microarray data. [en]
dc.language.iso: en_US [en]
dc.publisher: BioMed Central [en]
dc.relation.isversionof: doi:10.1186/1471-2105-14-210 [en]
dc.relation.hasversion: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3776447/pdf/ [en]
dash.license: LAA [en_US]
dc.title: Gene set analysis using variance component tests [en]
dc.type: Journal Article [en_US]
dc.description.version: Version of Record [en]
dc.relation.journal: BMC Bioinformatics [en]
dash.depositing.author: Lin, Xihong [en_US]
dc.date.available: 2014-03-10T16:16:36Z
dc.identifier.doi: 10.1186/1471-2105-14-210 [*]
dash.contributor.affiliated: Lin, Xihong
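
The abstract in this record describes a quadratic test statistic weighted by a working covariance, with p-values obtained by permutation. The sketch below illustrates that general idea only; it is not the authors' TEGS implementation, and the exact form of their statistic is not given in this record. The function names, the score-type statistic, and the use of NumPy are all assumptions made for illustration.

```python
import numpy as np


def gene_set_score_stat(x, Y, working_cov):
    """Illustrative quadratic statistic for a gene-set effect (assumed form).

    x: (n,) exposure values; Y: (n, p) expression matrix;
    working_cov: (p, p) working covariance among the p genes.
    """
    x = np.asarray(x, dtype=float)
    Y = np.asarray(Y, dtype=float)
    xc = x - x.mean()                      # centre the exposure
    Yc = Y - Y.mean(axis=0)                # centre each gene
    s = Yc.T @ xc                          # per-gene cross-product with exposure
    u = np.linalg.solve(working_cov, s)    # weight by inverse working covariance
    return float(u @ u)


def permutation_pvalue(x, Y, working_cov, n_perm=999, seed=0):
    """Permutation p-value: shuffle the exposure labels, recompute the statistic."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    observed = gene_set_score_stat(x, Y, working_cov)
    exceed = 0
    for _ in range(n_perm):
        if gene_set_score_stat(rng.permutation(x), Y, working_cov) >= observed:
            exceed += 1
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_perm + 1)
```

With an identity working covariance the statistic reduces to a sum of squared per-gene cross-products, which ignores correlation among genes; passing an estimated covariance instead weights the genes jointly.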

