Functional transcriptomics in the post-ENCODE era

Genome Res. 2013 Dec;23(12):1961-73. doi: 10.1101/gr.161315.113. Epub 2013 Oct 30.

Abstract

The last decade has seen tremendous effort committed to the annotation of the human genome sequence, most notably perhaps in the form of the ENCODE project. One of the major findings of ENCODE, and other genome analysis projects, is that the human transcriptome is far larger and more complex than previously thought. This complexity manifests, for example, as alternative splicing within protein-coding genes, as well as in the discovery of thousands of long noncoding RNAs. It is also possible that significant numbers of human transcripts have not yet been described by annotation projects, while existing transcript models are frequently incomplete. What proportion of this complexity is truly functional remains an open question, however, and this ambiguity presents a serious challenge to genome scientists. In this article, we discuss the current state of human transcriptome annotation, drawing on our experience gained in generating the GENCODE gene annotation set. We highlight the gaps in our knowledge of transcript functionality that remain, and consider the potential computational and experimental strategies that can be used to help close them. We propose that an understanding of the true overlap between transcriptional complexity and functionality will not be gained in the short term. However, significant steps toward obtaining this knowledge can now be taken by using an integrated strategy, combining all of the experimental resources at our disposal.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Alternative Splicing
  • Animals
  • Databases, Genetic
  • Evolution, Molecular
  • Genome, Human
  • Genomics / methods*
  • Humans
  • Molecular Sequence Annotation*
  • Proteins / genetics*
  • Proteomics
  • RNA, Long Noncoding
  • Sequence Alignment
  • Transcriptome*

Substances

  • Proteins
  • RNA, Long Noncoding