PreCimp: Pre-collapsing imputation approach increases imputation accuracy of rare variants in terms of collapsed variables

Genet Epidemiol. 2017 Jan;41(1):41-50. doi: 10.1002/gepi.22020. Epub 2016 Nov 10.

Abstract

Imputation is widely used to obtain information about rare variants. However, imputed rare variants tend to have low accuracy, and inaccurately imputed rare variants may distort the results of region-based association tests. We therefore developed a pre-collapsing imputation method (PreCimp) that improves imputation accuracy by using collapsed variables. Briefly, collapsed variables are generated from the rare variants in the reference panel, and a new reference panel is constructed by inserting these pre-collapsed variables into the original reference panel. Subsequent imputation analysis then provides the imputed genotypes of the collapsed variables directly. We evaluated the performance of PreCimp on 5,349 genotyped samples using a Korean population-specific reference panel of 848 samples with exome sequencing, Affymetrix 5.0, and exome chip data. PreCimp outperformed a traditional post-collapsing method, which collapses imputed variants after single rare variant imputation analysis. Compared with the post-collapsing method, PreCimp yielded a relative increase in imputation accuracy of about 3.4-6.3% when dosage r2 was between 0.6 and 0.8, 10.9-16.1% when dosage r2 was between 0.4 and 0.6, and 21.4-129.4% when dosage r2 was below 0.4.
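The panel-construction step described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the collapsing rule, so the sketch assumes a simple burden-style indicator (1 if a haplotype carries the minor allele at any rare variant in the region, else 0). The dictionary-based panel layout, the function names `collapse_indicator` and `add_collapsed_variable`, and the toy genotypes are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical toy representation: variant ID -> alleles (0 = major, 1 = minor),
# one entry per reference haplotype. Real panels are phased haplotype files.
Panel = Dict[str, List[int]]


def collapse_indicator(panel: Panel, rare_ids: List[str]) -> List[int]:
    """Return 1 for each haplotype carrying a minor allele at any listed variant.

    Assumption: an indicator-style collapsing rule; the source does not specify one.
    """
    n_haplotypes = len(panel[rare_ids[0]])
    return [
        int(any(panel[v][h] == 1 for v in rare_ids))
        for h in range(n_haplotypes)
    ]


def add_collapsed_variable(panel: Panel, rare_ids: List[str], name: str) -> Panel:
    """Return a copy of the panel with the collapsed variable added as a new marker."""
    new_panel = dict(panel)  # shallow copy so the original panel is left unchanged
    new_panel[name] = collapse_indicator(panel, rare_ids)
    return new_panel


# Made-up toy data: one common SNP and two rare variants over six haplotypes.
reference = {
    "common_snp": [0, 1, 1, 0, 1, 0],
    "rare_1": [0, 0, 1, 0, 0, 0],
    "rare_2": [0, 0, 0, 0, 0, 1],
}
augmented = add_collapsed_variable(reference, ["rare_1", "rare_2"], "collapsed_region")
print(augmented["collapsed_region"])  # [0, 0, 1, 0, 0, 1]
```

In this sketch the augmented panel would then be passed to an imputation tool, so that the collapsed marker is imputed like any other variant. The post-collapsing baseline would instead impute `rare_1` and `rare_2` separately and combine them afterwards.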

Keywords: SNPs; genotyping; imputation; next generation sequencing; population genetics.

Publication types

  • Comparative Study
  • Multicenter Study

MeSH terms

  • Adult
  • Aged
  • Algorithms*
  • Biomarkers / analysis*
  • Case-Control Studies
  • Computational Biology / methods*
  • Diabetes Mellitus, Type 2 / genetics*
  • Exome / genetics*
  • Female
  • Genome-Wide Association Study / methods
  • Genotype
  • Humans
  • Male
  • Middle Aged
  • Oligonucleotide Array Sequence Analysis
  • Polymorphism, Single Nucleotide / genetics*
  • Prospective Studies

Substances

  • Biomarkers