Algorithms for mapping short degenerate and weighted sequences to a reference genome

Int J Comput Biol Drug Des. 2009;2(4):385-97. doi: 10.1504/IJCBDD.2009.030768. Epub 2009 Jan 4.

Abstract

Novel high-throughput (deep) sequencing technologies have redefined the way genome sequencing is performed. They can produce millions of short sequences in a single experiment, at a much lower cost than previous methods. In this paper, we address the problem of efficiently mapping millions of short sequences to a reference genome and classifying them according to whether or not they occur exactly once in the genome, taking probability scores into consideration. In particular, we design algorithms for massive exact and approximate pattern matching of short degenerate and weighted sequences, derived from deep sequencing technologies, to a reference genome.
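As an illustration of the classification task described in the abstract, the following is a minimal brute-force sketch, not the algorithms of the paper. It assumes that a degenerate read is written with IUPAC ambiguity codes, that a weighted read is a list of per-position base probabilities, and that a weighted read matches at a position when the product of the probabilities of the genome's bases is at least a chosen threshold. The function names and the threshold parameter are hypothetical.

```python
# Brute-force illustration only: O(n * m) per read, exact matching, no indexing.
# IUPAC ambiguity codes mapped to the set of bases each symbol may stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degenerate_hits(genome, read):
    """Return every start position where the degenerate read matches the genome."""
    m = len(read)
    return [
        p
        for p in range(len(genome) - m + 1)
        if all(genome[p + i] in IUPAC[sym] for i, sym in enumerate(read))
    ]


def weighted_hits(genome, weighted_read, threshold):
    """Return start positions where the match probability is >= threshold.

    weighted_read is a list of dicts, one per position, mapping a base to its
    probability (an assumed representation of a weighted sequence).
    """
    m = len(weighted_read)
    hits = []
    for p in range(len(genome) - m + 1):
        prob = 1.0
        for i, dist in enumerate(weighted_read):
            prob *= dist.get(genome[p + i], 0.0)
            if prob < threshold:
                break  # the product can only shrink, so stop early
        else:
            hits.append(p)
    return hits


def classify(hits):
    """Label a read by its number of occurrences in the genome."""
    if not hits:
        return "absent"
    return "unique" if len(hits) == 1 else "repeated"


genome = "ACGTACGA"
print(classify(degenerate_hits(genome, "CGTA")))  # one occurrence
print(classify(degenerate_hits(genome, "ACGW")))  # W stands for A or T
print(classify(degenerate_hits(genome, "TTT")))   # no occurrence
```

A practical mapper would replace the linear scan with an index over the genome or the reads; the scan is used here only to keep the classification rule visible.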

MeSH terms

  • Algorithms*
  • Base Sequence
  • Chromosome Mapping / methods*
  • Computational Biology
  • Genome*
  • High-Throughput Screening Assays
  • Sequence Analysis, DNA / methods*