Statistical analysis of the 5' untranslated region of human mRNA using "Oligo-Capped" cDNA libraries

Genomics. 2000 Mar 15;64(3):286-97. doi: 10.1006/geno.2000.6076.

Abstract

We constructed 34 types of human "full-length enriched" and "5'-end enriched" cDNA libraries using the "Oligo-Capping" method. We randomly picked and sequenced 10,000 clones from these libraries. BLAST analysis showed that about 50% of the cDNAs were identical to known genes. Among them, we selected 954 species of cDNA that should represent the entire sequence from the mRNA start sites. Compared with previously reported sequences, they were on average 45 bp longer at the 5' end. Using these cDNA data, we statistically analyzed the sequence features of the 5'UTR. The average length of the 5'UTR was 125 bp, and it correlated only weakly with the corresponding mRNA length (correlation coefficient = 0.26). Of the 954 species of 5'UTR, 459 contained no in-frame terminator codon, contrary to common belief. Two hundred seventy-eight species contained at least one ATG codon upstream of the initiator ATG codon. In total, we identified 569 upstream ATGs, 63% of which adequately satisfied Kozak's criteria. These findings are contrary to the typical translation initiation model, which states that translation is initiated from the "first" ATG codon.
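
The three sequence features counted in the abstract (upstream ATGs, in-frame terminator codons in the 5'UTR, and the context around each ATG) can be illustrated with a minimal Python sketch. This is not the authors' pipeline. The function names, the toy sequence, and the `cds_start` argument are hypothetical. The Kozak test is an assumption: it uses the commonly cited simplified rule (a purine at position -3 or a G at position +4, with the A of ATG as +1), which may differ from the exact criteria applied in the paper.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def upstream_atgs(mrna, cds_start):
    """Return 0-based positions of every ATG lying fully inside the 5'UTR.

    cds_start is the 0-based index of the A of the initiator ATG
    (hypothetical input; the abstract does not describe a data format).
    """
    mrna = mrna.upper()
    return [i for i in range(cds_start - 2) if mrna[i:i + 3] == "ATG"]


def has_inframe_stop(mrna, cds_start):
    """True if the 5'UTR holds a stop codon in frame with the initiator ATG.

    Codons are read backwards from the initiator in steps of three,
    so only the reading frame of the main ORF is examined.
    """
    mrna = mrna.upper()
    for end in range(cds_start, 2, -3):
        if mrna[end - 3:end] in STOP_CODONS:
            return True
    return False


def kozak_adequate(mrna, atg_pos):
    """Simplified context check (assumption, not the paper's definition):
    purine at -3 or G at +4 relative to the A of the ATG at atg_pos.
    Positions that fall outside the sequence count as not matching.
    """
    mrna = mrna.upper()
    minus3 = mrna[atg_pos - 3] if atg_pos >= 3 else ""
    plus4 = mrna[atg_pos + 3] if atg_pos + 3 < len(mrna) else ""
    return minus3 in ("A", "G") or plus4 == "G"


# Toy example: a 12-base 5'UTR followed by a short ORF.
mrna = "GCATGGCTGACC" + "ATGGCGTAA"
cds_start = 12

print(upstream_atgs(mrna, cds_start))      # [2]
print(has_inframe_stop(mrna, cds_start))   # False: the TGA at index 7 is out of frame
print([kozak_adequate(mrna, p) for p in upstream_atgs(mrna, cds_start)])
```

In the toy sequence the 5'UTR contains a TGA, but it is not in frame with the initiator ATG, so it is not counted. This is the distinction behind the figure of 459 5'UTRs with no in-frame terminator codon.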

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • 5' Untranslated Regions*
  • Data Interpretation, Statistical
  • Gene Library
  • Humans
  • Molecular Sequence Data
  • Oligonucleotides / chemistry
  • Polymerase Chain Reaction
  • RNA Caps / chemistry*
  • Sequence Analysis, RNA

Substances

  • 5' Untranslated Regions
  • Oligonucleotides
  • RNA Caps

Associated data

  • GENBANK/AU076340
  • GENBANK/AU076341
  • GENBANK/AU076342
  • GENBANK/AU076343
  • GENBANK/AU076344
  • GENBANK/AU076345
  • GENBANK/AU076346
  • GENBANK/AU076347
  • GENBANK/AU076348
  • GENBANK/AU076349
  • GENBANK/AU076350
  • GENBANK/AU076351
  • GENBANK/AU076352
  • GENBANK/AU076353
  • GENBANK/AU076354
  • GENBANK/AU076355
  • GENBANK/AU076356
  • GENBANK/AU076357
  • GENBANK/AU076358
  • GENBANK/AU076359
  • GENBANK/AU076360
  • GENBANK/AU076361
  • GENBANK/AU076362
  • GENBANK/AU076363
  • GENBANK/AU076364
  • GENBANK/AU076365
  • GENBANK/AU076396
  • GENBANK/AU076397
  • GENBANK/AU076398
  • GENBANK/AU076399