Bridging the splicing gap in human genetics with long-read RNA sequencing: finding the protein isoform drivers of disease

Hum Mol Genet. 2022 Oct 20;31(R1):R123-R136. doi: 10.1093/hmg/ddac196.

Abstract

Aberrant splicing underlies many human diseases, including cancer, cardiovascular diseases and neurological disorders. Genome-wide mapping of splicing quantitative trait loci (sQTLs) has shown that genetic regulation of alternative splicing is widespread. However, identifying the isoform or protein products that correspond to disease-associated sQTLs is challenging with short-read RNA-seq, which cannot precisely characterize full-length transcript isoforms. Furthermore, contemporary sQTL interpretation often relies on reference transcript annotations, which are incomplete. Integrating emerging long-read sequencing technologies may address both issues. Long-read sequencing can capture full-length mRNA transcripts and, in some cases, link sQTLs to transcript isoforms containing disease-relevant protein alterations. Here, we provide an overview of sQTL mapping approaches, the use of long-read sequencing to characterize sQTL effects on isoforms, and the linkage of RNA isoforms to protein-level functions, and we comment on future directions in the field. Based on recent progress, long-read RNA sequencing promises to become part of the human disease genetics toolkit for discovering and treating protein isoforms that cause rare and complex diseases.

Publication types

  • Review
  • Research Support, N.I.H., Extramural

MeSH terms

  • Human Genetics*
  • Humans
  • Protein Isoforms / genetics
  • Protein Isoforms / metabolism
  • RNA Isoforms* / genetics
  • RNA, Messenger / genetics
  • Sequence Analysis, RNA

Substances

  • Protein Isoforms
  • RNA Isoforms
  • RNA, Messenger