Decision forest: combining the predictions of multiple independent decision tree models

J Chem Inf Comput Sci. 2003 Mar-Apr;43(2):525-31. doi: 10.1021/ci020058s.

Abstract

Techniques for combining the results of multiple classification models into a single prediction have been investigated for many years. In earlier applications, the models to be combined were developed by altering the training set. These so-called resampling techniques, however, risk reducing the predictivity of the individual models and/or overfitting the noise in the data, which can leave the composite model with poorer predictions than the individual models. In this paper, we suggest a novel approach, named Decision Forest, that combines multiple Decision Tree models, each developed using a unique set of descriptors. When models of similar predictive quality are combined using the Decision Forest method, prediction quality is consistently and significantly improved over that of the individual models in both the training and testing steps. An example is presented for the prediction of the binding affinity of 232 chemicals to the estrogen receptor.
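
The core idea, several tree models that each see a different set of descriptors and are then combined into one prediction, can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several points are assumptions made only for illustration: each "tree" is reduced to a single-split stump, the models are combined by averaging their predicted class probabilities (the abstract does not state the combination rule), the toy data are invented, and the names fit_stump, fit_forest, predict_proba, and predict are hypothetical.

```python
from statistics import mean


def fit_stump(X, y, features):
    """Fit a one-split tree (a stump) using only the given descriptor indices.

    Assumption: a stump stands in for a full decision tree to keep the sketch short.
    Returns a function mapping a sample to the estimated probability of class 1.
    """
    best = None
    for f in features:
        values = sorted({row[f] for row in X})
        # Candidate thresholds are midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [yi for row, yi in zip(X, y) if row[f] <= t]
            right = [yi for row, yi in zip(X, y) if row[f] > t]
            # Training errors if each side predicts its majority class.
            errors = sum(min(sum(s), len(s) - sum(s)) for s in (left, right))
            if best is None or errors < best[0]:
                best = (errors, f, t, mean(left), mean(right))
    if best is None:
        # No usable split: fall back to the overall class-1 frequency.
        prior = mean(y)
        return lambda row: prior
    _, f, t, p_left, p_right = best
    return lambda row: p_left if row[f] <= t else p_right


def fit_forest(X, y, descriptor_sets):
    """Fit one model per descriptor set; the sets are disjoint, so no descriptor is reused."""
    return [fit_stump(X, y, features) for features in descriptor_sets]


def predict_proba(forest, row):
    """Combine the models by averaging their class-1 probabilities (an assumed rule)."""
    return mean(tree(row) for tree in forest)


def predict(forest, row):
    return 1 if predict_proba(forest, row) >= 0.5 else 0


# Invented toy data: 6 samples, 4 descriptors, binary activity label.
X = [
    [0.1, 5.0, 1.0, 0.2],
    [0.2, 4.0, 0.9, 0.1],
    [0.3, 6.0, 0.2, 0.9],
    [0.8, 1.0, 0.8, 0.3],
    [0.9, 2.0, 0.1, 0.8],
    [0.7, 1.5, 0.3, 0.7],
]
y = [0, 0, 0, 1, 1, 1]

# Each model gets its own descriptors; together the sets cover columns 0..3 once.
descriptor_sets = [[0], [1], [2, 3]]
forest = fit_forest(X, y, descriptor_sets)
predictions = [predict(forest, row) for row in X]
```

In this sketch the third model, restricted to the noisier descriptors 2 and 3, misplaces some training samples on its own, while the averaged prediction does not. That only illustrates the mechanism on invented data and says nothing about the results reported for the estrogen receptor data set.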

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Animals
  • Databases, Factual
  • Decision Trees*
  • Forecasting
  • Humans
  • Models, Chemical*
  • Protein Binding
  • Receptors, Estrogen / metabolism
  • Structure-Activity Relationship

Substances

  • Receptors, Estrogen