Comparative analysis of human protein-coding and noncoding RNAs between brain and 10 mixed cell lines by RNA-Seq

PLoS One. 2011;6(11):e28318. doi: 10.1371/journal.pone.0028318. Epub 2011 Nov 30.

Abstract

During expression, a single gene can generate diverse functional products, including various protein-coding and noncoding RNAs. Here, using two transcriptome sequencing datasets from human brain tissues and 10 mixed cell lines, we investigated the protein-coding capacities and expression levels of the isoforms of known human genes, as well as the conservation and disease association of long noncoding RNAs (ncRNAs). Comparative analysis revealed that about two-thirds of the genes expressed in brain and cell lines are the same, but less than one-third of their isoforms are identical. Besides the genes specifically expressed in either brain or cell lines, about 66% of the genes expressed in both encoded different isoforms. Moreover, most genes dominantly expressed one isoform, and some genes generated only protein-coding (or only noncoding) RNAs in one sample but not in the other. We found 282 human genes that could encode both protein-coding and noncoding RNAs through alternative splicing in the two samples. We also identified more than 1,000 long ncRNAs, most of which contain elements conserved across 46 vertebrates, 33 placental mammals, or 10 primates. Further analysis showed that some long ncRNAs were differentially expressed in human breast cancer or lung cancer, and several of these were validated by RT-PCR. In addition, the validated differentially expressed long ncRNAs were found to be significantly correlated with certain breast cancer or lung cancer related genes, indicating the biological relevance of long ncRNAs to human cancers. Our findings reveal that differences in gene expression profiles between samples result mainly from the expressed gene isoforms, and highlight the importance of studying genes at the isoform level to fully characterize the intricate transcriptome.
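The contrast between gene-level and isoform-level overlap can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `overlap_summary`, the toy gene and isoform identifiers, and the choice of the union of both samples as the denominator are all assumptions made for illustration.

```python
def overlap_summary(sample_a, sample_b):
    """Compare two samples at the gene level and at the isoform level.

    Each sample maps a gene ID to the set of isoform IDs detected for it.
    Fractions use the union of both samples as the denominator, which is
    an assumption; the abstract does not state which denominator was used.
    """
    genes_a, genes_b = set(sample_a), set(sample_b)
    shared_genes = genes_a & genes_b
    all_genes = genes_a | genes_b

    # Flatten to (gene, isoform) pairs so isoforms are compared per gene.
    iso_a = {(g, i) for g, isos in sample_a.items() for i in isos}
    iso_b = {(g, i) for g, isos in sample_b.items() for i in isos}
    shared_isoforms = iso_a & iso_b
    all_isoforms = iso_a | iso_b

    # Shared genes whose detected isoform sets are not identical.
    differing = {g for g in shared_genes
                 if set(sample_a[g]) != set(sample_b[g])}

    def frac(part, whole):
        return len(part) / len(whole) if whole else 0.0

    return {
        "shared_gene_fraction": frac(shared_genes, all_genes),
        "shared_isoform_fraction": frac(shared_isoforms, all_isoforms),
        "shared_genes_with_different_isoforms": frac(differing, shared_genes),
    }


# Hypothetical toy data, not taken from the study.
brain = {
    "GENE1": {"GENE1.a", "GENE1.b"},
    "GENE2": {"GENE2.a"},
    "GENE3": {"GENE3.a"},
}
cell_lines = {
    "GENE1": {"GENE1.a", "GENE1.c"},
    "GENE2": {"GENE2.a"},
    "GENE4": {"GENE4.a"},
}

summary = overlap_summary(brain, cell_lines)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this toy case half of the genes are shared but only a third of the isoforms are, which mirrors the direction of the reported result without reproducing its numbers.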

Publication types

  • Comparative Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Base Sequence
  • Brain / metabolism*
  • Cell Line
  • Conserved Sequence
  • Disease / genetics
  • Exons / genetics
  • Female
  • Gene Expression Regulation
  • Humans
  • Male
  • Open Reading Frames / genetics*
  • Protein Isoforms / genetics
  • Protein Isoforms / metabolism
  • RNA, Messenger / genetics
  • RNA, Messenger / metabolism
  • RNA, Untranslated / genetics*
  • Reproducibility of Results
  • Reverse Transcriptase Polymerase Chain Reaction
  • Sequence Analysis, RNA / methods*

Substances

  • Protein Isoforms
  • RNA, Messenger
  • RNA, Untranslated