A fast and efficient algorithm for mapping short sequences to a reference genome

Adv Exp Med Biol. 2010:680:399-403. doi: 10.1007/978-1-4419-5913-3_45.

Abstract

Novel high-throughput (deep) sequencing technologies have redefined the way genome sequencing is performed. They can produce tens of millions of short sequences (reads) in a single experiment, at a much lower cost than previous sequencing methods. In this paper, we present a new algorithm for efficiently mapping millions of short reads to a reference genome. In particular, we define the Massive Approximate Pattern Matching problem for this mapping task and solve it.
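
The abstract does not spell out the problem definition or the algorithm itself. As a rough illustration of what mapping many short reads approximately can look like, the sketch below uses a generic textbook seed-and-verify scheme based on the pigeonhole principle. It is not the authors' method. It assumes that all reads have the same length, that "approximate" means at most `max_mismatches` substitutions (Hamming distance, no insertions or deletions), and that the reference fits in memory as a string. The names `hamming`, `build_index`, and `map_reads` are hypothetical.

```python
from collections import defaultdict


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def build_index(reference, seed_len):
    """Map every substring of length seed_len to its start positions."""
    index = defaultdict(list)
    for i in range(len(reference) - seed_len + 1):
        index[reference[i:i + seed_len]].append(i)
    return index


def map_reads(reference, reads, max_mismatches):
    """Return {read: sorted start positions with <= max_mismatches mismatches}.

    Assumption: all reads share one length. Each read is cut into
    max_mismatches + 1 disjoint seeds; by the pigeonhole principle, at
    least one seed matches the reference exactly wherever the whole read
    matches with at most max_mismatches substitutions.
    """
    read_len = len(reads[0])
    seed_len = read_len // (max_mismatches + 1)
    if seed_len == 0:
        raise ValueError("reads are too short for this many mismatches")
    index = build_index(reference, seed_len)

    hits = {}
    for read in reads:
        positions = set()
        for s in range(max_mismatches + 1):
            offset = s * seed_len
            seed = read[offset:offset + seed_len]
            # Exact seed hits give candidate alignments to verify.
            for pos in index.get(seed, ()):
                start = pos - offset
                if 0 <= start <= len(reference) - read_len:
                    window = reference[start:start + read_len]
                    if hamming(read, window) <= max_mismatches:
                        positions.add(start)
        hits[read] = sorted(positions)
    return hits


if __name__ == "__main__":
    reference = "ACGTTAGCCGATAGGCTA"
    reads = ["TAGCCGAT", "TAGCAGAT", "GGGGGGGG"]
    print(map_reads(reference, reads, max_mismatches=1))
```

In this toy run the first read matches the reference exactly, the second matches the same location with one substitution, and the third has no match within the allowed mismatches. The index is built once and shared by all reads, which is why schemes of this kind are used when the number of reads is very large.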

MeSH terms

  • Algorithms*
  • Animals
  • Chromosome Mapping / statistics & numerical data*
  • Computational Biology
  • Genomics / statistics & numerical data*
  • Mice
  • Pattern Recognition, Automated / statistics & numerical data
  • RNA / genetics
  • Sequence Alignment / statistics & numerical data*
  • X Chromosome / genetics

Substances

  • RNA