A method is described for estimating the distribution and hence testing the statistical significance of sequence similarity scores obtained during a data-bank search. Maximum-likelihood is used to fit a model to the scores, avoiding any costly simulation of random sequences. The method is applied in detail to the Smith-Waterman algorithm when gaps are allowed, and is shown to give results very similar to those obtained by simulation.