Comprehensive assessment of mRNA isoform detection methods for long-read sequencing data

Nat Commun. 2024 May 10;15(1):3972. doi: 10.1038/s41467-024-48117-3.

Abstract

Advances in Long-Read Sequencing (LRS) techniques have increased read lengths to several kilobases, thereby facilitating the identification of alternative splicing events and the quantification of isoform expression. Recently, numerous computational tools for isoform detection using long-read sequencing data have been developed. Nevertheless, there remains a lack of comparative studies that systematically evaluate the performance of these tools, which are implemented with different algorithms, under various simulations that encompass potential influencing factors. In this study, we conducted a benchmark analysis of thirteen methods implemented in nine tools capable of identifying isoform structures from long-read RNA-seq data. We evaluated their performance using simulated data generated by an in-house simulator to represent diverse sequencing platforms, RNA sequins (sequencing spike-ins) data, and experimental data. Our findings show that IsoQuant is a highly effective tool for isoform detection with LRS, with Bambu and StringTie2 also exhibiting strong performance. These results offer valuable guidance for future research on alternative splicing analysis and the ongoing improvement of tools for isoform detection using LRS data.
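As an illustration of how a benchmark of this kind can score a tool against simulated ground truth, the following minimal sketch compares predicted isoforms with known isoforms by exact intron-chain match and reports precision, recall, and F1. The abstract does not state which metrics or matching rule the authors used, so the matching criterion, the function names (`intron_chain`, `score`), and the toy coordinates are all assumptions made for illustration only.

```python
# Hypothetical sketch: NOT the authors' evaluation code.
# Assumption: an isoform is identified by (chromosome, strand, intron chain),
# and a prediction counts as correct only on an exact intron-chain match.

def intron_chain(exons):
    """Derive the intron chain from a list of (start, end) exon intervals."""
    exons = sorted(exons)
    # Each intron spans from the end of one exon to the start of the next.
    return tuple((exons[i][1], exons[i + 1][0]) for i in range(len(exons) - 1))


def score(predicted, truth):
    """Return (precision, recall, f1) for two collections of isoform keys."""
    pred, true = set(predicted), set(truth)
    tp = len(pred & true)  # isoforms found in both sets
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(true) if true else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy data (made-up coordinates): three true isoforms of one gene.
truth = [
    ("chr1", "+", intron_chain([(100, 200), (300, 400), (500, 600)])),
    ("chr1", "+", intron_chain([(100, 200), (500, 600)])),  # exon skipping
    ("chr1", "+", intron_chain([(100, 200), (300, 450), (500, 600)])),
]
# A tool reports two isoforms; only the first matches the truth exactly.
predicted = [
    ("chr1", "+", intron_chain([(100, 200), (300, 400), (500, 600)])),
    ("chr1", "+", intron_chain([(100, 250), (500, 600)])),
]

precision, recall, f1 = score(predicted, truth)
```

Note that under this simplification all mono-exon isoforms on the same chromosome and strand share an empty intron chain and collapse into one key, so a real evaluation would need a separate rule for them.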

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Alternative Splicing*
  • Computational Biology / methods
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Protein Isoforms / genetics
  • RNA Isoforms / genetics
  • RNA, Messenger* / analysis
  • RNA, Messenger* / genetics
  • Sequence Analysis, RNA* / methods
  • Software

Substances

  • RNA, Messenger
  • RNA Isoforms
  • Protein Isoforms