Bayesian Models for Pooling Microarray Studies with Multiple Sources of Replications

Title: Bayesian Models for Pooling Microarray Studies with Multiple Sources of Replications
Author: Conlon, Erin M; Song, Joon J; Liu, Jun

Note: Order does not necessarily reflect citation order of authors.

Citation: Conlon, Erin M., Joon J. Song, and Jun S. Liu. 2006. Bayesian models for pooling microarray studies with multiple sources of replications. BMC Bioinformatics 7:247.
Abstract:

Background: Biologists often conduct multiple but different cDNA microarray studies that all target the same biological system or pathway. Within each study, replicate slides within repeated identical experiments are often produced. Pooling information across studies can help more accurately identify true target genes. Here, we introduce a method to integrate multiple independent studies efficiently.

Results: We introduce a Bayesian hierarchical model to pool cDNA microarray data across multiple independent studies to identify highly expressed genes. Each study has multiple sources of variation, i.e. replicate slides within repeated identical experiments. Our model produces the gene-specific posterior probability of differential expression, which provides a direct method for ranking genes, and provides Bayesian estimates of false discovery rates (FDR). In simulations combining two and five independent studies, with fixed FDR levels, we observed large increases in the number of discovered genes in pooled versus individual analyses. When the number of output genes is fixed (e.g., top 100), the pooled model found appreciably more truly differentially expressed genes than the individual studies. We were also able to identify more differentially expressed genes from pooling two independent studies in Bacillus subtilis than from each individual data set. Finally, we observed that in our simulation studies our Bayesian FDR estimates tracked the true FDRs very well.

Conclusion: Our method provides a cohesive framework for combining multiple but not identical microarray studies with several sources of replication, with data produced from the same platform. We assume that each study contains only two conditions: an experimental and a control sample. We demonstrated our model's suitability for a small number of studies that have been either pre-scaled or have no outliers.
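
The abstract says that gene-specific posterior probabilities of differential expression give both a gene ranking and a Bayesian FDR estimate, but this record does not state the estimator. The sketch below assumes a commonly used form: for a list of the top k genes, the estimated FDR is the average of (1 - posterior probability) over those genes. The function names `bayesian_fdr` and `select_at_fdr` and the example probabilities are hypothetical, and the code does not implement the authors' hierarchical model, only the post-processing step under that assumption.

```python
def bayesian_fdr(post_prob):
    """Rank genes by posterior probability of differential expression.

    Returns (order, fdr): gene indices sorted from most to least probable,
    and fdr[k-1] = estimated FDR of the list made of the top k genes.
    Assumed estimator (not taken from the record): mean of (1 - p) over the list.
    """
    order = sorted(range(len(post_prob)), key=lambda g: -post_prob[g])
    fdr = []
    expected_false = 0.0  # running sum of (1 - p) = expected false discoveries
    for k, g in enumerate(order, start=1):
        expected_false += 1.0 - post_prob[g]
        fdr.append(expected_false / k)
    return order, fdr


def select_at_fdr(post_prob, alpha):
    """Return the largest top-k gene list whose estimated FDR is <= alpha."""
    order, fdr = bayesian_fdr(post_prob)
    # fdr is non-decreasing because genes are added in order of decreasing p,
    # so counting the entries that pass gives the largest admissible k.
    k = sum(1 for f in fdr if f <= alpha)
    return order[:k]


if __name__ == "__main__":
    # Hypothetical posterior probabilities for five genes.
    probs = [0.99, 0.20, 0.95, 0.60, 0.90]
    print(select_at_fdr(probs, alpha=0.10))
```

The two modes mentioned in the abstract map onto this sketch directly: fixing the FDR level corresponds to `select_at_fdr`, while fixing the number of output genes (e.g., top 100) corresponds to taking the first entries of `order` and reading off the matching entry of `fdr`.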
Published Version: doi:10.1186/1471-2105-7-247
Terms of Use: This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at

This item appears in the following Collection(s)

  • FAS Scholarly Articles [8111]
    Peer reviewed scholarly articles from the Faculty of Arts and Sciences of Harvard University