Quod erat demonstrandum? The mystery of experimental validation of apparently erroneous computational analyses of protein sequences

Genome Biol. 2001;2(12):RESEARCH0051. doi: 10.1186/gb-2001-2-12-research0051. Epub 2001 Nov 13.

Abstract

Background: Computational predictions are critical for directing the experimental study of protein functions. It is therefore paradoxical when an apparently erroneous computational prediction seems to be supported by experiment.

Results: We analyzed six cases where application of novel or conventional computational methods for protein sequence and structure analysis led to non-trivial predictions that were subsequently supported by direct experiments. We show that, on all six occasions, the original prediction was unjustified, and in at least three cases, an alternative, well-supported computational prediction, incompatible with the original one, could be derived. The most unusual cases involved the identification of an archaeal cysteinyl-tRNA synthetase, a dihydropteroate synthase and a thymidylate synthase, for which experimental verifications of apparently erroneous computational predictions were reported. Using sequence-profile analysis, multiple alignment and secondary-structure prediction, we have identified the unique archaeal 'cysteinyl-tRNA synthetase' as a homolog of extracellular polygalactosaminidases, and the 'dihydropteroate synthase' as a member of the beta-lactamase-like superfamily of metal-dependent hydrolases.

Conclusions: In each of the analyzed cases, the original computational predictions could be refuted and, in some instances, alternative strongly supported predictions were obtained. The nature of the experimental evidence that appears to support these predictions remains an open question. Some of these experiments might signify discovery of extremely unusual forms of the respective enzymes, whereas the results of others could be due to artifacts.

MeSH terms

  • Acetyltransferases / chemistry
  • Acetyltransferases / physiology
  • Activating Transcription Factor 2
  • Amino Acid Sequence
  • Amino Acyl-tRNA Synthetases / chemistry
  • Amino Acyl-tRNA Synthetases / physiology
  • Arabidopsis Proteins*
  • Archaeal Proteins / chemistry
  • Archaeal Proteins / physiology
  • Artifacts
  • Basic Helix-Loop-Helix Transcription Factors
  • Computational Biology*
  • Cyclic AMP Response Element-Binding Protein / chemistry
  • Cyclic AMP Response Element-Binding Protein / physiology
  • Dihydropteroate Synthase / chemistry
  • Dihydropteroate Synthase / physiology
  • Forecasting
  • Histone Acetyltransferases
  • Humans
  • Molecular Sequence Data
  • Phytochrome / chemistry
  • Phytochrome / physiology
  • Plant Proteins / chemistry
  • Plant Proteins / physiology
  • Plant Viral Movement Proteins
  • Protein Structure, Tertiary
  • Proteins / chemistry*
  • Proteins / physiology*
  • Saccharomyces cerevisiae Proteins*
  • Sequence Alignment
  • Sequence Analysis, Protein*
  • Thymidylate Synthase / chemistry
  • Thymidylate Synthase / physiology
  • Transcription Factors / chemistry
  • Transcription Factors / physiology
  • Viral Proteins / chemistry
  • Viral Proteins / physiology

Substances

  • Activating Transcription Factor 2
  • Arabidopsis Proteins
  • Archaeal Proteins
  • Basic Helix-Loop-Helix Transcription Factors
  • Cyclic AMP Response Element-Binding Protein
  • PIF3 protein, Arabidopsis
  • Plant Proteins
  • Plant Viral Movement Proteins
  • Proteins
  • Saccharomyces cerevisiae Proteins
  • Transcription Factors
  • Viral Proteins
  • Phytochrome
  • Thymidylate Synthase
  • Acetyltransferases
  • Histone Acetyltransferases
  • Dihydropteroate Synthase
  • Amino Acyl-tRNA Synthetases
  • cysteinyl-tRNA synthetase