Gene Selection and Classification for Cancer Microarray Data Based on Machine Learning and Similarity Measures

Title: Gene Selection and Classification for Cancer Microarray Data Based on Machine Learning and Similarity Measures
Author: Liu, Qingzhong; Sung, Andrew H; Chen, Zhongxue; Liu, Jianzhong; Chen, Lei; Qiao, Mengyu; Wang, Zhaohui; Deng, Youping; Huang, Xudong

Note: Order does not necessarily reflect citation order of authors.

Citation: Liu, Qingzhong, Andrew H. Sung, Zhongxue Chen, Jianzhong Liu, Lei Chen, Mengyu Qiao, Zhaohui Wang, Xudong Huang, and Youping Deng. 2011. Gene selection and classification for cancer microarray data based on machine learning and similarity measures. BMC Genomics 12(Suppl. 5): S1.
Abstract:
Background: Microarray data combine a very large number of variables with a small sample size. Two important issues in microarray data analysis are how to choose genes that provide reliable and accurate prediction of disease status, and how to determine the final gene set that is best for classification. Because genetic markers are associated with one another, the resulting information redundancy can potentially be exploited to reduce the cost of classification in both time and money.
Results: To deal with redundant information and improve classification, we propose a gene selection method, Recursive Feature Addition, which combines supervised learning and statistical similarity measures. To determine the final optimal gene set for prediction and classification, we propose an algorithm, Lagging Prediction Peephole Optimization. Using six benchmark microarray gene expression data sets, we compared Recursive Feature Addition with recently developed gene selection methods, including Support Vector Machine Recursive Feature Elimination, Leave-One-Out Calculation Sequential Forward Selection, and several others.
Conclusions: On average, across popular learning machines including Nearest Mean Scaled Classifier, Support Vector Machine, Naive Bayes Classifier, and Random Forest, Recursive Feature Addition outperformed the other methods. Our studies also showed that Lagging Prediction Peephole Optimization is superior to a random strategy, and that Recursive Feature Addition with Lagging Prediction Peephole Optimization obtained better testing accuracies than the gene selection method varSelRF.
Published Version: doi:10.1186/1471-2164-12-S5-S1
Other Sources: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3287491/pdf/
Terms of Use: This article is made available under the terms and conditions applicable to Other Posted Material, as set forth at http://nrs.harvard.edu/urn-3:HUL.InstRepos:dash.current.terms-of-use#LAA
Citable link to this page: http://nrs.harvard.edu/urn-3:HUL.InstRepos:10318285