An efficient automated computer vision based technique for detection of three dimensional structural motifs in proteins

J Biomol Struct Dyn. 1992 Feb;9(4):769-89. doi: 10.1080/07391102.1992.10507955.

Abstract

As the number of available three-dimensional protein coordinate sets increases, it is now recognized that proteins from different families and topologies are constructed from independent motifs. Detection of specific structural motifs within proteins aids in understanding their role and mechanism of operation. To aid in the identification and use of these motifs, it has become necessary to develop efficient methods for systematic scanning of structural databases. To date, methods of structural protein comparison suffer from at least one of the following limitations: they (1) are not fully automated (require human intervention), (2) are limited to relatively similar structures, (3) are constrained to linear alignments of the structures, (4) are sensitive to insertions, deletions or gaps in the sequences, or (5) are very time consuming. We present a method that overcomes these limitations. The method discovers and ranks every piece of structural similarity between the structures compared, thus allowing the simultaneous detection of real 3-D motifs within different domains, between domains, in active sites, on surfaces, etc. The method uses the Geometric Hashing Paradigm, an efficient technique originally developed for Computer Vision. The algorithm exploits the geometrical constraints of rigid objects, is especially geared towards recognition of partial structures in rigid objects belonging to large databases, and is straightforwardly parallelizable. Computer Vision techniques are applied here for the first time to molecular structure comparison, resulting in an efficient, fully automated tool. The method has been tested in a number of cases, including comparisons of the haemoglobins, immunoglobulins, serine proteinases, calcium binding proteins, DNA binding proteins and others. In all examples our results were equivalent to the published results from previous methods, and in some cases our method yielded additional structural information.
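To make the idea of geometric hashing concrete, the following is a minimal illustrative sketch in Python. It is not the authors' implementation. The function names (`frame`, `cell_keys`, `build_table`, `vote`), the 1.0 Å cell size, and the toy coordinates standing in for atom positions are all assumptions made for illustration. The sketch stores each model point in a hash table, expressed in the reference frame of every ordered non-collinear triplet of model points. At recognition time, a triplet from the scene votes for model frames that place points in the same cells. Because the frame coordinates are invariant to rigid motion, a partial match survives rotation, translation, and unrelated extra points.

```python
from collections import defaultdict
from itertools import permutations

import numpy as np


def frame(p0, p1, p2):
    """Right-handed orthonormal frame from three points, or None if (nearly) collinear."""
    x = p1 - p0
    z = np.cross(x, p2 - p0)
    if np.linalg.norm(z) < 1e-6:
        return None
    x = x / np.linalg.norm(x)
    z = z / np.linalg.norm(z)
    return p0, np.stack([x, np.cross(z, x), z])  # rows are the x, y, z axes


def cell_keys(points, basis, cell):
    """Quantized coordinates of all points in the given frame (rigid-motion invariant)."""
    origin, axes = basis
    local = (points - origin) @ axes.T
    return {tuple(k) for k in np.rint(local / cell).astype(int).tolist()}


def build_table(models, cell=1.0):
    """Preprocessing: hash every model point under every ordered basis triplet."""
    table = defaultdict(list)
    for name, pts in models.items():
        for triplet in permutations(range(len(pts)), 3):
            basis = frame(*pts[list(triplet)])
            if basis is not None:
                for key in cell_keys(pts, basis, cell):
                    table[key].append((name, triplet))
    return table


def vote(table, scene, scene_triplet, cell=1.0):
    """Recognition: one scene basis votes for the (model, basis) pairs sharing its cells."""
    votes = defaultdict(int)
    basis = frame(*scene[list(scene_triplet)])
    if basis is not None:
        for key in cell_keys(scene, basis, cell):
            for entry in table.get(key, ()):
                votes[entry] += 1
    return votes


# Toy data (hypothetical coordinates, not taken from any real structure).
motif = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [5.2, 3.3, 0.0], [8.1, 4.2, 2.7],
                  [9.3, 1.1, 5.2], [6.8, -2.2, 6.1], [3.1, -3.8, 4.3], [1.2, -1.3, 7.8]])
extra = np.array([[20.4, 10.2, -3.3], [-7.6, 12.1, 9.4], [15.3, -9.2, -6.7]])

# Scene: the first five motif points plus unrelated points, rigidly moved.
a, b = 0.7, -1.1
rz = np.array([[np.cos(a), -np.sin(a), 0.0], [np.sin(a), np.cos(a), 0.0], [0.0, 0.0, 1.0]])
rx = np.array([[1.0, 0.0, 0.0], [0.0, np.cos(b), -np.sin(b)], [0.0, np.sin(b), np.cos(b)]])
scene = np.vstack([motif[:5], extra]) @ (rz @ rx).T + np.array([12.0, -4.0, 7.5])

table = build_table({"motif": motif})
votes = vote(table, scene, (0, 1, 2))
best = max(votes, key=votes.get)
print(best, votes[best])
```

In this toy run the winning entry is the motif frame built on points 0, 1 and 2, with five votes, one for each motif point present in the scene. A fuller search would repeat the voting step over many scene triplets and verify the top candidates with a least-squares superposition. The voting runs for different triplets are independent of one another, which is consistent with the abstract's remark that the approach parallelizes straightforwardly.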

Publication types

  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Amino Acid Sequence
  • Bacterial Proteins*
  • Calcium-Binding Proteins / chemistry
  • Computer Simulation*
  • DNA-Binding Proteins / chemistry*
  • Immunoglobulins / chemistry
  • Matched-Pair Analysis
  • Molecular Sequence Data
  • Protein Conformation*
  • Repressor Proteins / chemistry
  • Transcription Factors / chemistry
  • Viral Proteins / chemistry
  • Viral Regulatory and Accessory Proteins

Substances

  • Bacterial Proteins
  • Calcium-Binding Proteins
  • DNA-Binding Proteins
  • Immunoglobulins
  • Repressor Proteins
  • TRPR protein, E coli
  • Transcription Factors
  • Viral Proteins
  • Viral Regulatory and Accessory Proteins
  • phage repressor proteins