The effect of reduction in cross-validation intervals on the performance of multifactor dimensionality reduction

Genet Epidemiol. 2006 Sep;30(6):546-55. doi: 10.1002/gepi.20166.

Abstract

Multifactor Dimensionality Reduction (MDR) was developed to detect genetic polymorphisms that confer an increased risk of disease. Cross-validation (CV) is an important part of the MDR algorithm because it prevents over-fitting and allows the predictive ability of a model to be evaluated, but it is also computationally intensive. Traditionally, MDR has been implemented using 10-fold CV. To reduce computation time and thereby allow MDR analysis to be applied to larger datasets, we evaluated the possibility of eliminating CV or reducing the number of CV intervals used for analysis. We found that eliminating CV made final model selection impossible, but that reducing the number of CV intervals from ten to five caused no loss of power while halving the computation time of the algorithm. The validity of this reduction was confirmed with data from an Alzheimer's disease (AD) study.
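
The abstract gives no implementation details, so the following Python sketch is only an illustration of how the number of CV intervals drives the amount of model fitting. It assumes the commonly described MDR scheme in which a multilocus genotype combination is labelled high-risk when cases outnumber controls in the training data. All function names (k_fold_indices, fit_high_risk_cells, error_rate, cross_validate) and the synthetic data are hypothetical and are not taken from the study.

```python
import random
from collections import Counter
from itertools import combinations


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_high_risk_cells(genotypes, status, loci, rows):
    """Return the genotype cells in which cases outnumber controls.

    Assumption: a case:control threshold of 1, suitable for balanced data.
    Cells never seen in `rows` are treated as low-risk by default.
    """
    cases, controls = Counter(), Counter()
    for i in rows:
        cell = tuple(genotypes[i][locus] for locus in loci)
        (cases if status[i] == 1 else controls)[cell] += 1
    return {cell for cell in cases if cases[cell] > controls[cell]}


def error_rate(genotypes, status, loci, high_risk, rows):
    """Fraction of `rows` whose predicted status disagrees with the truth."""
    wrong = 0
    for i in rows:
        cell = tuple(genotypes[i][locus] for locus in loci)
        predicted_case = cell in high_risk
        wrong += predicted_case != (status[i] == 1)
    return wrong / len(rows)


def cross_validate(genotypes, status, order, k, seed=0):
    """Run k-fold CV over every combination of `order` loci."""
    folds = k_fold_indices(len(status), k, seed)
    n_loci = len(genotypes[0])
    chosen, test_errors, fits = [], [], 0
    for f in range(k):
        test = folds[f]
        train = [i for g, fold in enumerate(folds) if g != f for i in fold]
        best = None
        for loci in combinations(range(n_loci), order):
            high_risk = fit_high_risk_cells(genotypes, status, loci, train)
            fits += 1  # one model fit per locus combination per fold
            err = error_rate(genotypes, status, loci, high_risk, train)
            if best is None or err < best[0]:
                best = (err, loci, high_risk)
        _, loci, high_risk = best
        chosen.append(loci)
        test_errors.append(error_rate(genotypes, status, loci, high_risk, test))
    model, consistency = Counter(chosen).most_common(1)[0]
    return {
        "model": model,
        "cv_consistency": consistency,  # folds that selected `model`
        "mean_test_error": sum(test_errors) / k,
        "fits": fits,
    }


# Synthetic data (hypothetical): 200 subjects, 4 loci coded 0/1/2, with
# status driven by an interaction between loci 0 and 1 plus 10% noise.
rng = random.Random(42)
genotypes = [[rng.randint(0, 2) for _ in range(4)] for _ in range(200)]
status = []
for g in genotypes:
    at_risk = (g[0] == 1) != (g[1] == 1)
    if rng.random() < 0.1:
        at_risk = not at_risk
    status.append(1 if at_risk else 0)

for k in (10, 5):
    result = cross_validate(genotypes, status, order=2, k=k)
    print(k, result["fits"], result["model"], result["cv_consistency"])
```

In this sketch the count of model fits equals the number of locus combinations multiplied by the number of CV intervals, so moving from ten intervals to five halves the number of fits. With a single pass over the data and no held-out interval, there would be no test error or cross-validation consistency to compare candidate models on.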

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alzheimer Disease / genetics*
  • Genetic Predisposition to Disease / genetics*
  • Genotype
  • Humans
  • Models, Genetic*
  • Multifactorial Inheritance
  • Polymorphism, Genetic
  • Risk Assessment / statistics & numerical data