Automated identification of single nucleotide polymorphisms from sequencing data

Proc IEEE Comput Soc Bioinform Conf. 2002:1:87-93.

Abstract

Single nucleotide polymorphisms (SNPs) provide abundant information about genetic variation. Large-scale discovery of high-frequency SNPs is being undertaken using various methods. However, the publicly available SNP data are not always accurate and should therefore be verified. If only a particular gene locus is of interest, locus-specific polymerase chain reaction amplification may be useful. The difficulty with this method is that the secondary peak has to be measured. We have analyzed trace data from conventional sequencing equipment and found an applicable rule to discern SNPs from noise. We have developed software that integrates this function to identify SNPs automatically. The software works accurately for high-quality sequences and can also detect SNPs in low-quality sequences. Further, it can determine allele frequency, display this information as a bar graph, and assign the corresponding nucleotide combinations. It is very useful for identifying de novo SNPs in a DNA fragment of interest.
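The abstract does not state the rule used to separate SNPs from noise, so the following is only a minimal sketch of one plausible secondary-peak test, not the authors' method. It assumes that each base-called position comes with four peak heights (one per nucleotide), that a heterozygous site shows a secondary peak that is large relative to the primary peak, and that the third-highest signal approximates background noise. The names `call_snps`, `min_ratio`, and `noise_factor`, and both threshold values, are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SnpCall:
    position: int              # index of the base-called position
    primary: str               # nucleotide with the highest peak
    secondary: str             # nucleotide with the second-highest peak
    secondary_fraction: float  # secondary / (primary + secondary)


def call_snps(peaks, min_ratio=0.2, noise_factor=3.0):
    """Flag positions whose secondary peak looks like a real allele.

    peaks: list of dicts mapping each of 'A', 'C', 'G', 'T' to a peak height.
    min_ratio and noise_factor are illustrative thresholds (assumptions).
    """
    calls = []
    for position, heights in enumerate(peaks):
        # Rank the four channels from strongest to weakest signal.
        ranked = sorted(heights.items(), key=lambda kv: kv[1], reverse=True)
        (base1, h1), (base2, h2), (_, h3) = ranked[0], ranked[1], ranked[2]
        if h1 <= 0:
            continue  # no usable signal at this position
        # Assumed rule: the secondary peak must be a sizeable fraction of
        # the primary peak AND stand clearly above the background level.
        if h2 / h1 >= min_ratio and h2 > noise_factor * h3:
            calls.append(SnpCall(position, base1, base2, h2 / (h1 + h2)))
    return calls


if __name__ == "__main__":
    example = [
        {"A": 1000, "C": 20, "G": 15, "T": 10},   # clean homozygous call
        {"A": 600, "C": 10, "G": 400, "T": 12},   # candidate A/G site
        {"A": 300, "C": 100, "G": 90, "T": 80},   # noisy, low-quality position
    ]
    for call in call_snps(example):
        print(call)
```

The `secondary_fraction` value is a crude stand-in for the allele frequency estimate mentioned in the abstract. Peak heights from dye-terminator chemistry are not strictly proportional to allele abundance, so a real implementation would need calibration against samples of known genotype.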

Publication types

  • Evaluation Study

MeSH terms

  • Algorithms*
  • Chromosome Mapping / methods*
  • DNA Mutational Analysis / methods*
  • Pattern Recognition, Automated / methods
  • Polymorphism, Single Nucleotide / genetics*
  • Sequence Alignment / methods*
  • Sequence Analysis, DNA / methods*
  • Sequence Homology, Nucleic Acid
  • Software*