Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset

Title: Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset
Author: Choe, Sung E; Boutros, Michael; Halfon, Marc S; Michelson, Alan M; Church, George McDonald

Note: Order does not necessarily reflect citation order of authors.

Citation: Choe, Sung E., Michael Boutros, Alan M. Michelson, George M. Church, and Marc S. Halfon. 2005. Preferred analysis methods for Affymetrix GeneChips revealed by a wholly defined control dataset. Genome Biology 6(2): R16.
Abstract: Background: As more methods are developed to analyze RNA-profiling data, assessing their performance using control datasets becomes increasingly important.
Results: We present a 'spike-in' experiment for Affymetrix GeneChips that provides a defined dataset of 3,860 RNA species, which we use to evaluate analysis options for identifying differentially expressed genes. The experimental design incorporates two novel features. First, to obtain accurate estimates of false-positive and false-negative rates, 100-200 RNAs are spiked in at each fold-change level of interest, ranging from 1.2- to 4-fold. Second, instead of using an uncharacterized background RNA sample, a set of 2,551 RNA species is used as the constant (1x) set, allowing us to know whether any given probe set is truly present or absent. Application of a large number of analysis methods to this dataset reveals clear variation in their ability to identify differentially expressed genes. False-negative and false-positive rates are minimized when the following options are chosen: subtracting nonspecific signal from the PM probe intensities; performing an intensity-dependent normalization at the probe set level; and incorporating a signal intensity-dependent standard deviation in the test statistic.
Conclusions: A best-route combination of analysis methods is presented that allows detection of approximately 70% of true positives before reaching a 10% false-discovery rate. We highlight areas in need of improvement, including better estimates of false-discovery rates and decreased false-negative rates.
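Illustration: Because the spiked-in RNAs are known, the fraction of true positives detected at a given false-discovery rate can be computed directly from a ranked list of probe sets. The following is a minimal sketch of that calculation, not the authors' code. The function name sensitivity_at_fdr, its parameters, and the toy data are hypothetical, and it assumes "before reaching a 10% false-discovery rate" means the deepest cutoff in the ranked list whose observed false-discovery rate is still at or below the threshold.

```python
from typing import Sequence


def sensitivity_at_fdr(
    scores: Sequence[float],
    is_true: Sequence[bool],
    max_fdr: float = 0.10,
) -> float:
    """Fraction of known true positives recovered at the deepest cutoff
    whose observed false-discovery rate is <= max_fdr.

    scores  -- test statistic per probe set (larger = stronger evidence)
    is_true -- True if the probe set is a known differentially expressed
               spike-in, False if it belongs to the constant (1x) set
    """
    if len(scores) != len(is_true):
        raise ValueError("scores and is_true must have the same length")
    total_true = sum(1 for flag in is_true if flag)
    if total_true == 0:
        raise ValueError("at least one true positive is required")

    # Rank probe sets from strongest to weakest evidence.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_pos = 0
    false_pos = 0
    best = 0.0
    for i in order:
        if is_true[i]:
            true_pos += 1
        else:
            false_pos += 1
        # Observed FDR among everything called so far.
        fdr = false_pos / (true_pos + false_pos)
        if fdr <= max_fdr:
            best = max(best, true_pos / total_true)
    return best


# Toy data (made up for illustration): six probe sets, three true positives.
toy_scores = [6.0, 5.0, 4.0, 3.0, 2.0, 1.0]
toy_truth = [True, True, False, True, False, False]
print(sensitivity_at_fdr(toy_scores, toy_truth, max_fdr=0.10))
```

On the toy data, two of the three true positives rank above the first false positive, so two thirds are recovered at a 10% threshold. With real data the scores would come from whichever test statistic is being evaluated.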
Published Version: doi:10.1186/gb-2005-6-2-r16
Other Sources: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC551536/pdf/
Terms of Use: This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:4891682