Publication:
EMDomics: a robust and powerful method for the identification of genes differentially expressed between heterogeneous classes

Thumbnail Image

Date

2015

Journal Title

Journal ISSN

Volume Title

Publisher

Oxford University Press
The Harvard community has made this article openly available. Please share how this access benefits you.

Research Projects

Organizational Units

Journal Issue

Citation

Nabavi, Sheida, Daniel Schmolze, Mayinuer Maitituoheti, Sadhika Malladi, and Andrew H. Beck. 2015. “EMDomics: a robust and powerful method for the identification of genes differentially expressed between heterogeneous classes.” Bioinformatics 32 (4): 533-541. doi:10.1093/bioinformatics/btv634. http://dx.doi.org/10.1093/bioinformatics/btv634.

Research Data

Abstract

Motivation: A major goal of biomedical research is to identify molecular features associated with a biological or clinical class of interest. Differential expression analysis has long been used for this purpose; however, conventional methods perform poorly when applied to data with high within class heterogeneity. Results: To address this challenge, we developed EMDomics, a new method that uses the Earth mover’s distance to measure the overall difference between the distributions of a gene’s expression in two classes of samples and uses permutations to obtain q-values for each gene. We applied EMDomics to the challenging problem of identifying genes associated with drug resistance in ovarian cancer. We also used simulated data to evaluate the performance of EMDomics, in terms of sensitivity and specificity for identifying differentially expressed gene in classes with high within class heterogeneity. In both the simulated and real biological data, EMDomics outperformed competing approaches for the identification of differentially expressed genes, and EMDomics was significantly more powerful than conventional methods for the identification of drug resistance-associated gene sets. EMDomics represents a new approach for the identification of genes differentially expressed between heterogeneous classes and has utility in a wide range of complex biomedical conditions in which sample classes show within class heterogeneity. Availability and implementation: The R package is available at http://www.bioconductor.org/packages/release/bioc/html/EMDomics.html Contact: abeck2@bidmc.harvard.edu Supplementary information: supplementary data are available at Bioinformatics online.

Description

Keywords

Terms of Use

This article is made available under the terms and conditions applicable to Other Posted Material (LAA), as set forth at Terms of Service

Endorsement

Review

Supplemented By

Referenced By

Related Stories