Comparison of feature selection and classification for MALDI-MS data

dc.contributor.authorLiu, Qingzhong
dc.contributor.authorSung, Andrew H.
dc.contributor.authorChen, Zhongxue
dc.contributor.authorYang, Jack Y
dc.contributor.authorQiao, Mengyu
dc.contributor.authorYang, Mary Qu
dc.contributor.authorHuang, Xudong
dc.contributor.authorDeng, Youping
dc.date.accessioned2022-01-25T16:57:57Z
dc.date.available2022-01-25T16:57:57Z
dc.date.issued2009-07-07
dc.descriptionThis article was originally published in BMC Genomics in 2009. doi:10.1186/1471-2164-10-S1-S3
dc.description.abstractIntroduction: In the classification of Mass Spectrometry (MS) proteomics data, peak detection, feature selection, and learning classifiers are critical to classification accuracy. To better understand which methods are more accurate when classifying data, some publicly available peak detection algorithms for Matrix-assisted Laser Desorption Ionization Mass Spectrometry (MALDI-MS) data were recently compared; however, how different feature selection methods and different classification models affect classification performance has not been addressed. With the application of intelligent computing, much progress has been made in the development of feature selection methods and learning classifiers for the analysis of high-throughput biological data. The main objective of this paper is to compare feature selection methods and learning classifiers when applied to MALDI-MS data and to provide a reference for the subsequent analysis of MS proteomics data. Results: We compared a well-known method of feature selection, Support Vector Machine Recursive Feature Elimination (SVMRFE), with a recently developed method, Gradient based Leave-one-out Gene Selection (GLGS), that performs effectively in microarray data analysis. We also compared several learning classifiers, including the K-Nearest Neighbor Classifier (KNNC), Naive Bayes Classifier (NBC), Nearest Mean Scaled Classifier (NMSC), the uncorrelated normal-based quadratic Bayes classifier (UDC), Support Vector Machines (SVMs), and the Large Margin Nearest Neighbor classifier (LMNN), a distance metric learning method based on Mahalanobis distance. For the comparison, we conducted a comprehensive experimental study using three types of MALDI-MS data. Conclusion: Regarding feature selection, SVMRFE outperformed GLGS in classification. As for the learning classifiers, when classification models derived from the best training were compared, SVMs performed the best with respect to the expected testing accuracy. However, the distance metric learning method LMNN outperformed SVMs and the other classifiers when the best testing results were evaluated. In such cases, the optimum classification model based on LMNN is worth investigating in future studies.
dc.description.sponsorshipMississippi Functional Genomics Network (DHHS/NIH/NCRR Grant# 2P20RR016476-04).
dc.description.subjectGradient based Leave-one-out Gene Selection
dc.description.subjectmicroarray data analysis
dc.description.subjectMahalanobis distance
dc.description.subjectLarge Margin Nearest Neighbor classifier
dc.identifier.citationLiu Q, Sung AH, Qiao M, Chen Z, Yang J, Yang M, Huang X, Deng Y (2009). Comparison of feature selection and classification for MALDI-MS data, BMC Genomics, 10 (Suppl 1): S3. doi:10.1186/1471-2164-10-S1-S3
dc.identifier.urihttps://hdl.handle.net/20.500.11875/3261
dc.language.isoen_US
dc.publisherBMC Genomics
dc.subjectpeak detection algorithms
dc.subjectMass Spectrometry (MS)
dc.subjectLaser Desorption Ionization Mass Spectrometry (MALDI-MS)
dc.subjectclassification
dc.subjectMS proteomics data
dc.titleComparison of feature selection and classification for MALDI-MS data
dc.typeArticle

Files

Original bundle

Name: Comparison of feature selection and classification for MALDI-MS_OCR.pdf
Size: 1.07 MB
Format: Adobe Portable Document Format
Description: Article
