Automating resequencing-based detection of insertion-deletion polymorphisms

Nat Genet. 2006 Dec;38(12):1457-62. doi: 10.1038/ng1925. Epub 2006 Nov 19.

Abstract

Structural and insertion-deletion (indel) variants have received considerable recent attention, partly because of their phenotypic consequences. Among these variants, the most common are small indels (approximately 1-30 bp). Identifying and genotyping indels using sequence traces obtained from diploid samples requires extensive manual review, which makes large-scale studies inconvenient. We report a new algorithm, implemented in available software (PolyPhred version 6.0), to help automate detection and genotyping of indels from sequence traces. The algorithm identifies heterozygous individuals, which permits the discovery of low-frequency indels. It finds 80% of all indel polymorphisms with almost no false positives and finds 97% with a false discovery rate of 10%. Additionally, genotyping accuracy exceeds 99%, and it correctly infers indel length in 96% of the cases. Using this approach, we identify indels in the HapMap ENCODE regions, providing the first report of these polymorphisms in this data set.
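To illustrate why heterozygous indels are detectable in diploid traces at all, the following toy sketch may help. It is not the PolyPhred algorithm, whose details the abstract does not give. It assumes an idealized trace in which each position is reduced to the set of bases observed (primary and secondary peaks), with no noise. Downstream of a heterozygous deletion the two alleles fall out of register, so each position shows the reference base mixed with the base a fixed number of positions further along; the offset that best explains the mixture is taken as the indel length. The function names, the example sequence, and the 30 bp search limit are all illustrative assumptions.

```python
def simulate_het_trace(ref, start, length):
    """Idealized trace for a sample heterozygous for a deletion.

    Returns one set of bases per position: the reference allele
    superimposed on the allele lacking ref[start:start + length].
    (Hypothetical helper for illustration only.)
    """
    deleted = ref[:start] + ref[start + length:]
    return [{a, b} for a, b in zip(ref, deleted)]


def infer_indel_length(ref, trace, max_len=30):
    """Guess (first_mixed_position, indel_length) from an idealized trace.

    Tries each candidate offset d and scores the fraction of positions
    from the first mixed position onward where the observed bases equal
    {ref[i], ref[i + d]}. Returns None if no position is mixed.
    The reported position can lie downstream of the true breakpoint
    when the bases flanking the indel happen to be identical.
    """
    mixed = [i for i, bases in enumerate(trace) if len(bases) > 1]
    if not mixed:
        return None
    first = mixed[0]
    best_frac, best_d = -1.0, None
    for d in range(1, max_len + 1):
        n = min(len(trace), len(ref) - d)
        hits = sum(1 for i in range(first, n)
                   if trace[i] == {ref[i], ref[i + d]})
        frac = hits / max(1, n - first)
        if frac > best_frac:
            best_frac, best_d = frac, d
    return first, best_d


# Made-up 40 bp reference with a 3 bp heterozygous deletion at index 10.
ref = "ACGTTGCATGCCATAGCTAGGATCCGATTACGGCTAAGTC"
trace = simulate_het_trace(ref, start=10, length=3)
print(infer_indel_length(ref, trace))  # (10, 3)
```

Real traces carry peak-height noise, and repetitive sequence can make several offsets fit equally well, so a practical method would need a statistical score rather than exact set matching.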

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Algorithms*
  • DNA Transposable Elements
  • Genetic Techniques
  • Heterozygote
  • Humans
  • Mutation
  • Polymorphism, Genetic*
  • Polymorphism, Single Nucleotide
  • Sequence Analysis, DNA
  • Sequence Deletion

Substances

  • DNA Transposable Elements