Interpretation of nonlinear QSAR models applied to Ames mutagenicity data

J Chem Inf Model. 2009 Nov;49(11):2551-8. doi: 10.1021/ci9002206.

Abstract

A method for the local interpretation of QSAR models is presented and applied to an Ames mutagenicity data set. Local interpretation of Support Vector Machine and Random Forest models is achieved by retrieving the variable corresponding to the largest component of the decision-function gradient at any given point in the model. That variable is regarded as the most important contributor to the model at that particular point. The method has been verified using two sets of simulated data and the Ames mutagenicity data. This work indicates that it is possible to interpret nonlinear machine-learning methods. A comparison with an interpretable linear method is also presented.
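
The core idea, picking the variable with the largest gradient component at a given point, can be illustrated with a minimal sketch. This is not the authors' implementation. It makes several assumptions that the abstract does not state: the gradient is estimated by central finite differences with a step size `h`, "largest" is taken to mean largest absolute value, and `toy_decision_function` is a hypothetical smooth stand-in for a trained model's decision function. The helper names `numerical_gradient` and `most_important_variable` are also illustrative.

```python
import math


def numerical_gradient(f, x, h=1e-4):
    """Central-difference estimate of the gradient of f at point x.

    Assumption: f is smooth enough near x for finite differences to be
    meaningful; the step size h is an illustrative choice.
    """
    grad = []
    for i in range(len(x)):
        up = list(x)
        down = list(x)
        up[i] += h
        down[i] -= h
        grad.append((f(up) - f(down)) / (2.0 * h))
    return grad


def most_important_variable(f, x, h=1e-4):
    """Return (index, gradient component) of the locally dominant variable.

    Assumption: "largest component" is read as largest absolute value;
    the sign of the returned component gives the direction of the effect.
    """
    grad = numerical_gradient(f, x, h)
    idx = max(range(len(grad)), key=lambda i: abs(grad[i]))
    return idx, grad[idx]


def toy_decision_function(x):
    """Hypothetical nonlinear decision function of two descriptors."""
    return math.tanh(3.0 * x[0]) + 0.5 * x[1] ** 2


# The locally dominant descriptor differs from point to point.
idx_near, g_near = most_important_variable(toy_decision_function, [0.0, 0.1])
idx_far, g_far = most_important_variable(toy_decision_function, [2.0, 3.0])
print(idx_near, idx_far)
```

In this toy case the first descriptor dominates near the origin, while the second dominates at the more distant point, which is the kind of point-dependent ranking a local interpretation is meant to expose. A piecewise-constant decision function, such as one produced by tree ensembles, would need a different gradient estimate than the plain finite differences used here.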

MeSH terms

  • Algorithms
  • Models, Theoretical*
  • Mutagenicity Tests*
  • Quantitative Structure-Activity Relationship