Structure prediction and analysis of DNA transposon and LINE retrotransposon proteins

J Biol Chem. 2013 May 31;288(22):16127-38. doi: 10.1074/jbc.M113.451500. Epub 2013 Mar 25.

Abstract

Despite the considerable amount of research on transposable elements (TEs), no large-scale structural analysis of the TE proteome has been performed so far. We predicted the structures of hundreds of proteins from a representative set of DNA and LINE transposable elements and used the resulting structural data to provide the first general structural characterization of TE proteins and to estimate the frequency of TE domestication and horizontal transfer events. We show that 1) ORF1 and Gag proteins of retrotransposons contain high amounts of structural disorder; thus, despite their very low conservation, the presence of disordered regions, and probably their chaperone function, is conserved. 2) The distribution of SCOP (Structural Classification of Proteins) classes in DNA transposons and LINEs indicates that the proteins of DNA transposons are more ancient, containing folds that already existed when the first cellular organisms appeared. 3) DNA transposon proteins have lower contact order than randomly selected reference proteins, indicating rapid folding, most likely to avoid protein aggregation. 4) Structure-based searches for TE homologs indicate that the overall frequency of TE domestication events is low, whereas we found a relatively high number of cases where horizontal transfer, frequently involving parasites, is the most likely explanation for the observed homology.
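
Finding 3 relies on contact order, a measure of how far apart in sequence the contacting residues of a folded protein are; lower values are generally associated with faster folding. The abstract does not state how the authors computed it, so the following is only a minimal sketch of the commonly used relative contact order, CO = (1 / (L · N)) Σ |i − j|, where L is the number of residues and N the number of contacts. The function name, the 8 Å cutoff, and the use of one coordinate per residue (for example the C-alpha atom) are assumptions made for illustration, not details from the paper.

```python
from math import dist


def relative_contact_order(coords, cutoff=8.0, min_separation=1):
    """Relative contact order from one 3D point per residue.

    coords         -- list of (x, y, z) tuples, one per residue (assumed
                      to be C-alpha positions; a simplification)
    cutoff         -- distance in angstroms below which two residues
                      count as being in contact (assumed value)
    min_separation -- minimum sequence separation for a pair to count
    """
    n_res = len(coords)
    total_separation = 0  # sum of |i - j| over all contacting pairs
    n_contacts = 0        # number of contacting pairs

    for i in range(n_res):
        for j in range(i + min_separation, n_res):
            if dist(coords[i], coords[j]) <= cutoff:
                total_separation += j - i
                n_contacts += 1

    if n_contacts == 0:
        return 0.0  # no contacts: contact order is undefined, report 0
    return total_separation / (n_res * n_contacts)


if __name__ == "__main__":
    # Toy four-residue chains with 3.8 angstrom spacing (hypothetical data).
    extended = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
    hairpin = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0)]
    print(relative_contact_order(extended))
    print(relative_contact_order(hairpin))
```

In this toy comparison the hairpin brings the chain ends together, so it scores a higher contact order than the extended chain. Real analyses typically count contacts between heavy atoms rather than single points per residue, so absolute values from this sketch are not comparable to published ones.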

Keywords: Evolution; Gene Transposable Elements; Horizontal Transfer; Intrinsically Disordered Proteins; Protein Evolution; Protein Folding; RNA World; Transposon Domestication.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • DNA Transposable Elements*
  • Gene Products, gag / chemistry
  • Gene Products, gag / genetics*
  • Humans
  • Long Interspersed Nucleotide Elements*
  • Protein Folding*
  • Protein Structure, Tertiary
  • Sequence Analysis, Protein*
  • Structural Homology, Protein*

Substances

  • DNA Transposable Elements
  • Gene Products, gag