RePS: a sequence assembler that masks exact repeats identified from the shotgun data

Genome Res. 2002 May;12(5):824-31. doi: 10.1101/gr.165102.

Abstract

We describe a sequence assembler, RePS (repeat-masked Phrap with scaffolding), that explicitly identifies exact 20-mer repeats from the shotgun data and removes them prior to assembly. The established software on which RePS builds, Phrap, is then used to compute meaningful error probabilities for each base. Clone-end-pairing information is used to construct scaffolds that order and orient the contigs. We show with real data for human and rice that reasonable assemblies are possible even at coverages of only 4x to 6x, despite an exact-repeat content of up to 42.2%.
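
The repeat-masking idea in the abstract can be illustrated with a minimal sketch. This is not the RePS implementation. The abstract does not state how many occurrences make a 20-mer a repeat, so the `threshold` parameter is an assumption, as are the function names, the mask character `X`, and the choice to count a k-mer together with its reverse complement.

```python
from collections import Counter

K = 20  # k-mer length named in the abstract

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def count_kmers(reads, k=K):
    """Count every k-mer across all shotgun reads, strand-independently."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[canonical(read[i:i + k])] += 1
    return counts


def mask_repeats(read, counts, threshold, k=K, mask_char="X"):
    """Mask every base covered by a k-mer seen at least `threshold` times.

    `threshold` is a hypothetical cutoff; in practice it would depend on
    the sequencing coverage.
    """
    masked = list(read)
    for i in range(len(read) - k + 1):
        if counts[canonical(read[i:i + k])] >= threshold:
            masked[i:i + k] = mask_char * k
    return "".join(masked)
```

For example, with `k=4` and `threshold=2`, the reads `ACCAGT` and `ACCATG` share only the 4-mer `ACCA`, so `mask_repeats("ACCAGT", count_kmers(["ACCAGT", "ACCATG"], k=4), 2, k=4)` masks the first four bases and leaves `GT` intact. The masked reads would then be handed to the assembler, so that overlaps are not seeded by repeated sequence.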

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Cloning, Molecular / methods
  • Contig Mapping / methods*
  • Humans
  • Oryza / genetics
  • Repetitive Sequences, Nucleic Acid / genetics*
  • Sequence Analysis, DNA / methods
  • Software*