SALSA: improved protein database searching by a new algorithm for assembly of sequence fragments into gapped alignments

Bioinformatics. 1998;14(10):839-45. doi: 10.1093/bioinformatics/14.10.839.

Abstract

Motivation: Optimal sequence alignment based on the Smith-Waterman algorithm is usually too computationally demanding to be practical for searching large sequence databases. Heuristic programs such as FASTA and BLAST run much faster, but at the expense of sensitivity.

Results: In an effort to approximate the sensitivity of an optimal alignment algorithm, a new algorithm has been devised for computing a gapped alignment of two sequences. After scanning for high-scoring words and extending these to form fragments of similarity, the algorithm uses dynamic programming to build an accurate alignment from the fragments initially identified. The algorithm has been implemented in a program called SALSA, and its performance has been evaluated on a set of test sequences. The sensitivity was found to be close to that of the Smith-Waterman algorithm, while the speed was similar to that of FASTA (ktup = 2).
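
The three stages named above (word scan, extension into fragments, dynamic programming over fragments) can be illustrated with a minimal Python sketch. This is not the SALSA implementation: the abstract does not give the scoring scheme, word threshold, extension rule, or gap model, so the toy match/mismatch score, the `threshold` value, the unbounded ungapped extension, and the diagonal-shift gap penalty below are all assumptions chosen only to make the idea concrete. All function names are hypothetical.

```python
def score(a, b, match=2, mismatch=-1):
    """Toy substitution score (assumption; a real search would use a matrix)."""
    return match if a == b else mismatch

def find_seeds(q, s, k=2, threshold=4):
    """Start positions (i, j) of length-k word pairs scoring >= threshold."""
    return [(i, j)
            for i in range(len(q) - k + 1)
            for j in range(len(s) - k + 1)
            if sum(score(q[i + t], s[j + t]) for t in range(k)) >= threshold]

def extend(q, s, i, j, k):
    """Ungapped extension of a seed along its diagonal.

    Returns (q_start, s_start, length, score) of the best-scoring segment.
    """
    best = run = sum(score(q[i + t], s[j + t]) for t in range(k))
    end, t = k, k
    while i + t < len(q) and j + t < len(s):      # extend to the right
        run += score(q[i + t], s[j + t])
        t += 1
        if run > best:
            best, end = run, t
    run, start, t = best, 0, 1
    while i - t >= 0 and j - t >= 0:              # extend to the left
        run += score(q[i - t], s[j - t])
        if run > best:
            best, start = run, t
        t += 1
    return (i - start, j - start, end + start, best)

def chain(fragments, gap_open=3, gap_ext=1):
    """Dynamic programming over fragments: best-scoring collinear chain.

    Joining two fragments on different diagonals costs an assumed penalty
    gap_open + gap_ext * shift; residues between fragments are not scored.
    """
    frags = sorted(fragments)
    if not frags:
        return 0, []
    best, prev = [], []
    for a, (qa, sa, la, sc_a) in enumerate(frags):
        best_a, prev_a = sc_a, None
        for b in range(a):
            qb, sb, lb, _ = frags[b]
            if qb + lb <= qa and sb + lb <= sa:   # b ends before a starts
                shift = abs((qa - sa) - (qb - sb))
                penalty = gap_open + gap_ext * shift if shift else 0
                cand = best[b] + sc_a - penalty
                if cand > best_a:
                    best_a, prev_a = cand, b
        best.append(best_a)
        prev.append(prev_a)
    idx = max(range(len(frags)), key=best.__getitem__)
    total, path = best[idx], []
    while idx is not None:                        # trace the chain back
        path.append(frags[idx])
        idx = prev[idx]
    return total, path[::-1]

def align_fragments(q, s, k=2, threshold=4):
    """Scan for words, extend them to fragments, then chain the fragments."""
    frags = {extend(q, s, i, j, k) for i, j in find_seeds(q, s, k, threshold)}
    return chain(frags)

total, path = align_fragments("ACDEFGHIKL", "ACDEFWWGHIKL")
print(total, path)
```

In this toy run the second sequence carries a two-residue insertion, so the two ungapped fragments lie on different diagonals and the chaining step joins them at the cost of one gap penalty. A full alignment would additionally need to align the residues between and around the chained fragments, which this sketch omits.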

Availability: Searches can be performed from the SALSA homepage at http://dna.uio.no/salsa/ using a wide range of databases. Source code and precompiled executables are also available.

Contact: torbjorn.rognes@labmed.uio.no

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Amino Acid Sequence
  • Animals
  • Computational Biology
  • Databases, Factual*
  • Deoxyribonuclease (Pyrimidine Dimer)*
  • Endodeoxyribonucleases / genetics
  • Escherichia coli Proteins*
  • Evaluation Studies as Topic
  • Globins / genetics
  • Molecular Sequence Data
  • Peptide Fragments / genetics
  • Proteins / genetics*
  • Sensitivity and Specificity
  • Sequence Alignment / statistics & numerical data*
  • Sequence Homology, Amino Acid
  • Software

Substances

  • Escherichia coli Proteins
  • Peptide Fragments
  • Proteins
  • Globins
  • Endodeoxyribonucleases
  • Deoxyribonuclease (Pyrimidine Dimer)
  • NTH protein, E coli