The impact of data quality on the identification of complex disease genes: experience from the Family Blood Pressure Program

Eur J Hum Genet. 2006 Apr;14(4):469-77. doi: 10.1038/sj.ejhg.5201582.

Abstract

The application of genome-wide linkage scans to uncover susceptibility loci for complex diseases offers great promise for the risk assessment, treatment, and understanding of these diseases. However, in most published studies, linkage signals are modest and vary considerably from one study to another. The multicenter Family Blood Pressure Program has analyzed genome-wide linkage scans of over 12 000 individuals. Based on this experience, we developed a protocol for large linkage studies that reduces two sources of data error: pedigree structure errors and marker genotyping errors. We then used the linkage signals, before and after data cleaning, to illustrate the impact of missing and erroneous data. A comprehensive error-checking protocol is an important part of complex disease linkage studies and enhances gene mapping. The lack of significant and reproducible linkage findings across studies is due, in part, to data quality.
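
The abstract does not describe the individual steps of the protocol. As a purely illustrative sketch of one common kind of check for the two error sources named above, the Python below tests parent-offspring trios for Mendelian consistency at each marker. The function names, the genotype encoding (a pair of integer allele codes, with None for a missing genotype), and the marker labels are all assumptions made for this example and are not taken from the paper.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical encoding: a genotype is a pair of allele codes; None means missing.
Genotype = Optional[Tuple[int, int]]
Trio = Tuple[Genotype, Genotype, Genotype]  # (father, mother, child)


def is_mendelian_consistent(father: Genotype, mother: Genotype, child: Genotype) -> bool:
    """Return True if the child could have inherited one allele from each parent.

    Trios with any missing genotype are treated as uninformative and pass.
    """
    if father is None or mother is None or child is None:
        return True
    a, b = child
    # One child allele must come from the father and the other from the mother.
    return (a in father and b in mother) or (b in father and a in mother)


def flag_inconsistent_markers(trio_by_marker: Dict[str, Trio]) -> List[str]:
    """List the markers at which the trio's genotypes violate Mendelian inheritance."""
    flagged = []
    for marker, (father, mother, child) in trio_by_marker.items():
        if not is_mendelian_consistent(father, mother, child):
            flagged.append(marker)
    return flagged


# Made-up genotypes for one trio at three hypothetical markers.
example = {
    "M1": ((1, 2), (3, 4), (1, 3)),  # consistent
    "M2": ((1, 1), (2, 2), (1, 1)),  # child carries no maternal allele
    "M3": ((1, 2), None, (5, 5)),    # mother missing, so not checked
}
print(flag_inconsistent_markers(example))  # ['M2']
```

In a check of this kind, an isolated inconsistency at a single marker is more suggestive of a genotyping error, whereas inconsistencies spread across many markers in the same family more often point to a misspecified pedigree relationship. Any threshold for separating the two would have to be chosen for the study at hand.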

Publication types

  • Multicenter Study

MeSH terms

  • Blood Pressure / genetics*
  • Genetic Linkage*
  • Genetic Markers
  • Genetic Predisposition to Disease*
  • Humans
  • Hypertension / diagnosis
  • Hypertension / genetics*
  • Hypertension / physiopathology
  • Lod Score
  • Quantitative Trait Loci
  • Research Design*

Substances

  • Genetic Markers