SAW: an efficient and accurate data analysis workflow for Stereo-seq spatial transcriptomics

GigaByte. 2024 Feb 20:2024:gigabyte111. doi: 10.46471/gigabyte.111. eCollection 2024.

Abstract

The basic analysis steps of spatial transcriptomics require obtaining gene expression information from both space and cells. The existing tools for these analyses incur performance issues when dealing with large datasets. These issues involve computationally intensive spatial localization, RNA genome alignment, and excessive memory usage in large chip scenarios. These problems affect the applicability and efficiency of the analysis. Here, a high-performance and accurate spatial transcriptomics data analysis workflow, called Stereo-seq Analysis Workflow (SAW), was developed for the Stereo-seq technology developed at BGI. SAW includes mRNA spatial position reconstruction, genome alignment, gene expression matrix generation, and clustering. The workflow outputs files in a universal format for subsequent personalized analysis. The execution time for the entire analysis is ∼148 min with 1 GB reads 1 × 1 cm chip test data, 1.8 times faster than with an unoptimized workflow.

Grants and funding

This research was supported by BGI and the National Key R&D Program of China (2022YFC3400400).