Sequence analysis using logic regression

Genet Epidemiol. 2001:21 Suppl 1:S626-31. doi: 10.1002/gepi.2001.21.s1.s626.

Abstract

Logic regression is a new adaptive regression methodology that attempts to construct predictors as Boolean combinations of binary covariates. In this paper we apply the algorithm to single-nucleotide polymorphism (SNP) sequence data. The predictors it finds are interpretable as risk factors for the disease. The significance of these risk factors is assessed using techniques such as cross-validation, permutation tests, and independent test sets. These model selection techniques remain valid when the data are dependent, as is the case for the family data used here. In our analysis of the Genetic Analysis Workshop 12 data we identify the exact locations of mutations on gene 1 and gene 6, and a number of mutations on gene 2, that are associated with affected status, without selecting any false positives.
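
To make the idea of a predictor built as a Boolean combination of binary covariates concrete, here is a minimal Python sketch. It is not the authors' algorithm: it swaps in an exhaustive search over two-variable AND/OR expressions with optional negation, scored by misclassification count, and the paper does not describe its search in this abstract. The function names (`evaluate`, `misclassification`, `best_pair_expression`) and the toy data are hypothetical and were invented for illustration only; they are not drawn from the Genetic Analysis Workshop 12 data.

```python
from itertools import combinations, product

def evaluate(expr, row):
    """Evaluate a Boolean expression on one row of binary covariates.

    expr is either ('var', index, negated) or (op, left, right),
    where op is 'and' or 'or'. This encoding is an assumption of the sketch.
    """
    if expr[0] == 'var':
        _, index, negated = expr
        value = bool(row[index])
        return (not value) if negated else value
    op, left, right = expr
    left_value = evaluate(left, row)
    right_value = evaluate(right, row)
    return (left_value and right_value) if op == 'and' else (left_value or right_value)

def misclassification(expr, X, y):
    """Count rows where the expression disagrees with the binary outcome."""
    return sum(evaluate(expr, row) != bool(label) for row, label in zip(X, y))

def best_pair_expression(X, y):
    """Exhaustively search all two-variable AND/OR expressions (toy search only)."""
    n_vars = len(X[0])
    best = None
    for i, j in combinations(range(n_vars), 2):
        for neg_i, neg_j, op in product([False, True], [False, True], ['and', 'or']):
            expr = (op, ('var', i, neg_i), ('var', j, neg_j))
            score = misclassification(expr, X, y)
            if best is None or score < best[0]:
                best = (score, expr)
    return best

# Toy data (invented): every combination of three binary covariates,
# with the outcome defined as "covariate 0 AND NOT covariate 2".
X = [list(bits) for bits in product([0, 1], repeat=3)]
y = [int(row[0] == 1 and row[2] == 0) for row in X]

score, expr = best_pair_expression(X, y)
print(score, expr)
```

On this toy data the search recovers the generating rule exactly. With real SNP data the space of expressions is far too large for enumeration, which is why an adaptive search and the model selection checks named in the abstract (cross-validation, permutation tests, independent test sets) would be needed.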

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Algorithms
  • Chromosome Mapping / statistics & numerical data*
  • DNA Mutational Analysis
  • Female
  • Genetic Predisposition to Disease / genetics*
  • Humans
  • Logistic Models*
  • Male
  • Models, Genetic*
  • Polymorphism, Single Nucleotide / genetics*
  • Quantitative Trait, Heritable