Bridging the splicing gap in human genetics with long-read RNA sequencing: finding the protein isoform drivers of disease

Peter J Castaldi; Abdullah Abood; Charles R Farber; Gloria M Sheynkman

doi:10.1093/hmg/ddac196

Hum Mol Genet. 2022 Oct 20;31(R1):R123-R136. doi: 10.1093/hmg/ddac196.

Authors

Peter J Castaldi^{1,2}, Abdullah Abood^{3,4}, Charles R Farber^{3,4,5}, Gloria M Sheynkman^{3,4,6,7}

Affiliations

¹ Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA.
² Division of General Medicine and Primary Care, Department of Medicine, Brigham and Women's Hospital, Boston, MA 02115, USA.
³ Center for Public Health Genomics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA.
⁴ Department of Biochemistry and Molecular Genetics, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA.
⁵ Department of Public Health Sciences, School of Medicine, University of Virginia, Charlottesville, VA 22903, USA.
⁶ Department of Molecular Physiology and Biological Physics, University of Virginia, Charlottesville, VA 22903, USA.
⁷ UVA Comprehensive Cancer Center, University of Virginia, Charlottesville, VA 22903, USA.

Abstract

Aberrant splicing underlies many human diseases, including cancer, cardiovascular diseases and neurological disorders. Genome-wide mapping of splicing quantitative trait loci (sQTLs) has shown that genetic regulation of alternative splicing is widespread. However, identifying the isoform or protein products that correspond to disease-associated sQTLs is challenging with short-read RNA-seq, which cannot precisely characterize full-length transcript isoforms. Furthermore, contemporary sQTL interpretation often relies on reference transcript annotations, which are incomplete. Solutions to these issues may be found through integration of newly emerging long-read sequencing technologies. Long-read sequencing offers the capability to sequence full-length mRNA transcripts and, in some cases, to link sQTLs to transcript isoforms containing disease-relevant protein alterations. Here, we provide an overview of sQTL mapping approaches, the use of long-read sequencing to characterize sQTL effects on isoforms and the linkage of RNA isoforms to protein-level functions, and we comment on future directions in the field. Based on recent progress, long-read RNA sequencing promises to be part of the human disease genetics toolkit to discover and treat protein isoforms causing rare and complex diseases.

Publication types

Review
Research Support, N.I.H., Extramural

MeSH terms

Human Genetics*
Humans
Protein Isoforms / genetics
Protein Isoforms / metabolism
RNA Isoforms* / genetics
RNA, Messenger / genetics
Sequence Analysis, RNA

Substances

Protein Isoforms
RNA Isoforms
RNA, Messenger
