An application of conditional logistic regression and multifactor dimensionality reduction for detecting gene-gene interactions on risk of myocardial infarction: the importance of model validation

BMC Bioinformatics. 2004 Apr 30:5:49. doi: 10.1186/1471-2105-5-49.

Abstract

Background: To examine interactions among the angiotensin converting enzyme (ACE) insertion/deletion, plasminogen activator inhibitor-1 (PAI-1) 4G/5G, and tissue plasminogen activator (t-PA) insertion/deletion gene polymorphisms on risk of myocardial infarction using data from 343 matched case-control pairs from the Physicians Health Study. We examined the data using both conditional logistic regression and the multifactor dimensionality reduction (MDR) method. One advantage of the MDR method is that it provides an internal prediction error for validation. We summarize our use of this internal prediction error for model validation.

Results: The overall results for the two methods were consistent, with both suggesting an interaction between the ACE I/D and PAI-1 4G/5G polymorphisms. However, using ten-fold cross validation, the 46% prediction error for the final MDR model was not significantly lower than that expected by chance.

Conclusions: The significant interaction initially observed does not validate and may represent a type I error. As data-driven analytic methods continue to be developed and used to examine complex genetic interactions, it will become increasingly important to stress model validation in order to ensure that significant effects represent true relationships rather than chance findings.

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.
  • Validation Study

MeSH terms

  • Gene Expression Regulation / genetics*
  • Genetic Predisposition to Disease / genetics*
  • Genotype
  • Humans
  • Logistic Models*
  • Myocardial Infarction / genetics*
  • Risk Assessment / statistics & numerical data