Analysis and classification of circular proteins in CyBase

Biopolymers. 2010;94(5):584-91. doi: 10.1002/bip.21424.

Abstract

CyBase is a database dedicated to the study of the sequences and three-dimensional structures of ribosomally synthesized, backbone-cyclized proteins and their synthetic variants. This article describes CyBase data and tools that are useful in the analysis of circular proteins. Circular proteins have now been discovered in organisms from all kingdoms of life, and given the current rate of discovery they could soon number in the thousands. Presently, CyBase manages 427 protein sequences, 106 nucleic acid sequences, and 49 protein three-dimensional structures from 44 different species. Circular proteins are grouped into distinct classes according to their origin and sequence similarities. These classes include trypsin inhibitors, bacterial proteins, mushroom toxins, cyclotides, and cyclic defensins from primates. Several protein classification types are used in CyBase to designate proteins that are either extracted from natural sources (wild type and precursor) or engineered (modified wild type, grafted, mutant, cyclic permutant, and acyclic permutant). CyBase has tools for the analysis of mass spectrum fingerprints of cyclic peptides, and assists in the discovery of new circular proteins. Some of the developments detailed here have been made specifically for the largest class of circular proteins, the cyclotides, but could be adapted for other classes of cyclic proteins. The cyclotide-specific tools include two-dimensional representations of domains and alternative displays of alignments for precursor sequences. This alignment prompted us to propose a revision of the cyclotide precursor organization, in which the repeated regions now include a small C-terminal region, which appears to have a significant role in the biosynthesis of mature cyclotides.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Amino Acid Sequence
  • Cyclotides / analysis
  • Cyclotides / classification
  • Cyclotides / genetics
  • Databases, Protein*
  • Mass Spectrometry / methods
  • Molecular Sequence Data
  • Peptides, Cyclic / analysis*
  • Peptides, Cyclic / classification*
  • Peptides, Cyclic / genetics
  • Plant Proteins / analysis
  • Plant Proteins / classification
  • Plant Proteins / genetics
  • Protein Engineering / methods
  • Protein Precursors / analysis
  • Protein Precursors / classification
  • Protein Precursors / genetics
  • Protein Structure, Tertiary*
  • Sequence Alignment

Substances

  • Cyclotides
  • Peptides, Cyclic
  • Plant Proteins
  • Protein Precursors