Visualization and recovery of the (bio)chemical interesting variables in data analysis with support vector machine classification

Patrick W T Krooshof; Bülent Ustün; Geert J Postma; Lutgarde M C Buydens

doi:10.1021/ac101338y

Anal Chem. 2010 Aug 15;82(16):7000-7. doi: 10.1021/ac101338y.

Authors

Patrick W T Krooshof¹, Bülent Ustün, Geert J Postma, Lutgarde M C Buydens

Affiliation

¹ Radboud University Nijmegen, Institute for Molecules and Materials, Analytical Chemistry, P.O. Box 9010, 6500 GL Nijmegen, The Netherlands.

PMID: 20704390
DOI: 10.1021/ac101338y

Abstract

Support vector machines (SVMs) have become a popular technique in chemometrics, bioinformatics, and other fields for the classification of complex data sets. Their use has increased substantially, mainly because SVMs are able to model nonlinear relationships. This is achieved by mapping the data into a higher-dimensional feature space. The disadvantage of such a transformation, however, is that information about the contribution of the original variables to the classification is lost. In this paper we introduce an innovative method that retrieves this information about the variables of complex data sets. We apply the proposed method to several benchmark data sets and a metabolomics data set to illustrate that the contribution of the original variables in SVM classifications can be determined. The corresponding visualization of the variable contributions can assist in a better understanding of the underlying chemical or biological process.
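
The mapping into a higher-dimensional feature space mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' method, and the abstract does not state which kernel they use; the homogeneous degree-2 polynomial kernel, the two-variable input, and the names phi, poly2_kernel, and dot are assumptions chosen for illustration only. The sketch shows that the kernel value computed on the original variables equals an inner product in a three-dimensional feature space, and that one of the feature-space coordinates mixes both original variables, so a weight on that coordinate cannot be attributed to a single variable.

```python
import math


def phi(x):
    """Explicit feature map of the homogeneous degree-2 polynomial kernel
    for a 2-D input (an illustrative choice, not taken from the paper)."""
    x1, x2 = x
    # The middle coordinate combines x1 and x2: this is where the
    # per-variable information becomes entangled.
    return (x1 * x1, math.sqrt(2.0) * x1 * x2, x2 * x2)


def poly2_kernel(x, z):
    """k(x, z) = (x . z)^2, evaluated directly on the original variables."""
    return (x[0] * z[0] + x[1] * z[1]) ** 2


def dot(a, b):
    """Plain inner product of two equal-length sequences."""
    return sum(ai * bi for ai, bi in zip(a, b))


# Two hypothetical samples with two original variables each.
x = (1.0, 2.0)
z = (3.0, -1.0)

k_input = poly2_kernel(x, z)     # computed in the 2-D input space
k_feature = dot(phi(x), phi(z))  # computed in the 3-D feature space

print(math.isclose(k_input, k_feature))
```

An SVM trained with such a kernel assigns its weights in the feature space, which is why the contribution of each original variable is not directly readable from the model.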

MeSH terms

Artificial Intelligence*
Computational Biology
Databases, Factual
Metabolomics
Models, Theoretical