New families of human regulatory RNA structures identified by comparative analysis of vertebrate genomes

Genome Res. 2011 Nov;21(11):1929-43. doi: 10.1101/gr.112516.110. Epub 2011 Oct 12.

Abstract

Regulatory RNA structures are often members of families with multiple paralogous instances across the genome. Family members share functional and structural properties, which allow them to be studied as a whole, facilitating both bioinformatic and experimental characterization. We have developed a comparative method, EvoFam, for genome-wide identification of families of regulatory RNA structures, based on primary sequence and secondary structure similarity. We apply EvoFam to a 41-way genomic vertebrate alignment. Genome-wide, we identify 220 high-confidence human families outside protein-coding regions, comprising 725 individual structures and including 48 families with known structural RNA elements. Known families identified include both noncoding RNAs (e.g., miRNAs and the recently identified MALAT1/MEN β lincRNA family) and cis-regulatory structures (e.g., iron-responsive elements). We also identify tens of new families supported by strong evolutionary evidence and other statistical evidence, such as GO term enrichments. For some of these, detailed analysis has led to the formulation of specific functional hypotheses. Examples include two hypothesized auto-regulatory feedback mechanisms: one involving six long hairpins in the 3'-UTR of MAT2A, a key metabolic gene that produces the primary human methyl donor S-adenosylmethionine; the other involving a tRNA-like structure in the intron of the tRNA maturation gene POP1. We experimentally validate the predicted MAT2A structures. Finally, we identify potential new regulatory networks, including large families of short hairpins enriched in immunity-related genes (e.g., TNF, FOS, and CTLA4), which include known transcript-destabilizing elements. Our findings exemplify the diversity of post-transcriptional regulation and provide a resource for further characterization of new regulatory mechanisms and families of noncoding RNAs.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • 3' Untranslated Regions
  • Animals
  • Base Sequence
  • Conserved Sequence
  • Gene Expression Regulation
  • Genome*
  • Genomics*
  • Humans
  • Immunity / genetics
  • Methionine Adenosyltransferase / genetics
  • Molecular Sequence Data
  • Nucleic Acid Conformation
  • Phylogeny
  • Protein Biosynthesis
  • RNA Editing
  • RNA Precursors / metabolism
  • RNA Processing, Post-Transcriptional
  • RNA Stability
  • RNA, Messenger / metabolism
  • RNA, Transfer / chemistry
  • RNA, Transfer / metabolism
  • RNA, Untranslated / chemistry*
  • RNA, Untranslated / genetics
  • Regulatory Sequences, Ribonucleic Acid*
  • Sequence Alignment
  • Vertebrates / genetics*

Substances

  • 3' Untranslated Regions
  • RNA Precursors
  • RNA, Messenger
  • RNA, Untranslated
  • Regulatory Sequences, Ribonucleic Acid
  • RNA, Transfer
  • Methionine Adenosyltransferase