Genome-wide association of breast cancer: composite likelihood with imputed genotypes

Eur J Hum Genet. 2011 Feb;19(2):194-9. doi: 10.1038/ejhg.2010.157. Epub 2010 Oct 20.

Abstract

We describe composite likelihood-based analysis of a genome-wide breast cancer case-control sample from the Cancer Genetic Markers of Susceptibility project. We determine 14 380 genome regions of fixed size on a linkage disequilibrium (LD) map, which delimit comparable levels of LD. Although the numbers of single-nucleotide polymorphisms (SNPs) are highly variable, each region contains an average of ∼35 SNPs and an average of ∼69 after imputation of missing genotypes. Composite likelihood association mapping yields a single P-value for each region, established by a permutation test, along with a maximum likelihood disease location, SE and information weight. For single SNP analysis, the nominal P-value for the most significant SNP (msSNP) requires substantial correction given the number of SNPs in the region. Therefore, imputing genotypes may not always be advantageous for the msSNP test, in contrast to composite likelihood. For the region containing FGFR2 (a known breast cancer gene) the largest χ² is obtained under composite likelihood with imputed genotypes (χ²₂ increases from 20.6 to 22.7), and compares with a single SNP-based χ²₂ of 19.9 after correction. Imputation of additional genotypes in this region reduces the size of the 95% confidence interval for location of the disease gene by ∼40%. Among the highest ranked regions, SNPs in the NTSR1 gene would be worthy of examination in additional samples. Meta-analysis, which combines weighted evidence from composite likelihood in different samples, and refines putative disease locations, is facilitated through defining fixed regions on an underlying LD map.
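The arithmetic behind two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' software, and the abstract does not say which correction or weighting scheme they used. The sketch assumes a Šidák correction for the msSNP P-value (which treats the SNPs in a region as independent tests, and is therefore conservative for SNPs in LD), the standard identity that a χ² variable with 2 degrees of freedom satisfies P = exp(−χ²/2), and inverse-variance weighting for pooling location estimates across samples. All function names are hypothetical.

```python
import math


def sidak_corrected_p(p_min: float, n_snps: int) -> float:
    """Correct the smallest nominal P-value in a region for n_snps tests.

    Assumption: Sidak correction, 1 - (1 - p)^n, which treats the tests as
    independent. SNPs in LD are correlated, so this is conservative.
    """
    if not 0.0 < p_min < 1.0 or n_snps < 1:
        raise ValueError("need 0 < p_min < 1 and n_snps >= 1")
    # expm1/log1p keep precision when p_min is very small.
    return -math.expm1(n_snps * math.log1p(-p_min))


def chi2_2df_from_p(p: float) -> float:
    """Convert a P-value to the equivalent chi-square with 2 df.

    For 2 df the upper tail probability is exp(-x / 2), so x = -2 ln(p).
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("need 0 < p <= 1")
    return -2.0 * math.log(p)


def pooled_location(estimates):
    """Pool (location, SE) pairs from several samples.

    Assumption: inverse-variance weights w = 1 / SE^2. Returns the weighted
    mean location and its standard error.
    """
    weights = [1.0 / se ** 2 for _, se in estimates]
    total = sum(weights)
    location = sum(w * loc for w, (loc, _) in zip(weights, estimates)) / total
    return location, math.sqrt(1.0 / total)
```

Under these assumptions, doubling the number of SNPs in a region by imputation roughly doubles the penalty applied to a small msSNP P-value, which illustrates why imputation may not always help the single SNP test. Pooling in `pooled_location` is only meaningful when every sample reports its estimate for the same fixed region on the same map.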

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms / epidemiology
  • Breast Neoplasms / genetics*
  • Case-Control Studies
  • Chromosome Mapping
  • Confidence Intervals
  • Female
  • Genetic Predisposition to Disease
  • Genome, Human*
  • Genome-Wide Association Study / methods*
  • Genotype
  • Humans
  • Likelihood Functions
  • Linkage Disequilibrium
  • Polymorphism, Single Nucleotide / genetics
  • Quality Control
  • Receptor, Fibroblast Growth Factor, Type 2 / genetics
  • Software

Substances

  • Receptor, Fibroblast Growth Factor, Type 2