Identification of specific sequence motifs in the upstream region of 242 human miRNA genes

Comput Biol Chem. 2007 Jun;31(3):207-14. doi: 10.1016/j.compbiolchem.2007.03.011. Epub 2007 Mar 31.

Abstract

We have identified novel over-represented and conserved motifs in the upstream regions of human and mouse miRNA stem-loop sequences by means of a new bioinformatic processing regimen. In 189 human and mouse miRNAs, we observed sequence conservation up to 500 bp upstream that declined with increasing distance from the putative miRNA stem-loop origin. We also found relatively GC-rich regions, with more than 50% guanine+cytosine (G+C) content, at about -30 and -170 bp relative to the human miRNA stem-loop sequence origin. To further identify specific sequence motifs that might be involved in the transcriptional regulation of miRNA precursors, we first searched the 500 bp upstream sequences of 194 non-redundant human miRNA stem-loop sequences for frequently occurring motifs 5-15 bp long. We then found comparable occurrences of the 20 most frequent motifs in the 2000 bp upstream regions of 242 human and 290 mouse miRNAs. The significantly reduced frequency of occurrence of all 20 motifs in the regions 2000 bp upstream of 23,570 human RefSeq genes demonstrated that these motifs were specific to the upstream miRNA sequences. The most frequently observed motif, M1 (GTGCTTMTAGTGCAG), with a MEME E-value of 3.8e-57, was distributed within 500 bp upstream of stem-loop sequences and was also miRNA-specific. We suggest that these over-represented motif sites are good candidates for experimentally testing miRNA expression as well as possible interactions with regulatory factors.
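The two measurements described above, windowed G+C content and the position of a motif relative to the stem-loop origin, can be illustrated with a minimal Python sketch. This is not the authors' pipeline (the abstract indicates MEME was used for motif discovery); it only shows how a known consensus such as M1 could be located in an upstream sequence. The function names, the window and step sizes, and the toy sequence are all assumptions made for illustration. The code M in the M1 consensus is the standard IUPAC symbol for A or C.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}


def motif_positions(upstream, motif):
    """Return motif start positions relative to the stem-loop origin.

    `upstream` is assumed to end immediately before the stem-loop,
    so its last base is at position -1 (hypothetical convention).
    """
    body = "".join(IUPAC[base] for base in motif.upper())
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=(" + body + "))")
    n = len(upstream)
    return [m.start() - n for m in pattern.finditer(upstream.upper())]


def gc_content(seq):
    """Fraction of G and C bases in `seq` (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def windowed_gc(upstream, window=20, step=10):
    """G+C fraction per window, keyed by the window's relative start.

    The window and step sizes are illustrative defaults, not values
    taken from the paper.
    """
    n = len(upstream)
    return [
        (start - n, gc_content(upstream[start:start + window]))
        for start in range(0, n - window + 1, step)
    ]


# Toy upstream sequence (invented for this example, not real data).
toy = "ATATATATAT" + "GTGCTTATAGTGCAG" + "GCGCG"
m1_hits = motif_positions(toy, "GTGCTTMTAGTGCAG")
gc_rich = [(pos, gc) for pos, gc in windowed_gc(toy, 10, 5) if gc > 0.5]
```

Counting the length of `motif_positions(...)` over a set of miRNA upstream sequences and over a background set (for example, RefSeq gene upstream regions) would give the kind of frequency comparison the abstract describes, though the statistical test used by the authors is not stated here.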

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Base Composition
  • Base Sequence
  • Computational Biology / methods*
  • Consensus Sequence / genetics*
  • CpG Islands
  • Databases, Genetic
  • GC Rich Sequence
  • Humans
  • Mice
  • MicroRNAs / genetics*
  • Response Elements

Substances

  • MicroRNAs