Tools for the analysis of high-dimensional single-cell RNA sequencing data

Nat Rev Nephrol. 2020 Jul;16(7):408-421. doi: 10.1038/s41581-020-0262-0. Epub 2020 Mar 27.

Abstract

Breakthroughs in the development of high-throughput technologies for profiling transcriptomes at the single-cell level have helped biologists to understand the heterogeneity of cell populations, disease states and developmental lineages. However, these single-cell RNA sequencing (scRNA-seq) technologies generate an extraordinary amount of data, which creates analysis and interpretation challenges. Additionally, scRNA-seq datasets often contain technical sources of noise owing to incomplete RNA capture, PCR amplification biases and/or batch effects specific to the patient or sample. If not addressed, this technical noise can bias the analysis and interpretation of the data. In response to these challenges, a suite of computational tools has been developed to process, analyse and visualize scRNA-seq datasets. Although the specific steps of any given scRNA-seq analysis might differ depending on the biological questions being asked, a core workflow is used in most analyses. Typically, raw sequencing reads are processed into a gene expression matrix that is then normalized and scaled to remove technical noise. Next, cells are grouped according to similarities in their patterns of gene expression, which can be summarized in two or three dimensions for visualization on a scatterplot. These data can then be further analysed to provide an in-depth view of the cell types or developmental trajectories in the sample of interest.
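The core workflow outlined above (expression matrix, normalization and scaling, grouping of cells, low-dimensional summary) can be illustrated with a minimal sketch. It is not the method of any specific tool covered by the review. It assumes a small simulated cells-by-genes count matrix, and uses library-size normalization with a log transform, per-gene scaling, PCA via singular value decomposition and plain k-means as simple stand-ins for the steps described. All function names, parameters and simulated values here are hypothetical and chosen for illustration only.

```python
import numpy as np


def normalize_log(counts, target_sum=1e4):
    """Scale each cell (row) to the same total count, then apply log(1 + x)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # leave empty cells as all zeros
    return np.log1p(counts / totals * target_sum)


def scale_genes(x):
    """Give each gene (column) zero mean and unit variance across cells."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    std[std == 0] = 1.0  # constant genes stay at zero after centering
    return (x - mean) / std


def pca(x, n_components=2):
    """Project cells onto the top principal components using SVD."""
    centered = x - x.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_components] * s[:n_components]


def kmeans(x, k, n_iter=50, seed=0):
    """Plain k-means clustering; returns one integer label per cell."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(n_iter):
        # Squared Euclidean distance from every cell to every center.
        dists = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            x[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Simulated data (an assumption for illustration): two populations of
# 30 cells each, distinguished by different sets of highly expressed genes.
rng = np.random.default_rng(42)
n_genes = 50
rates_a = np.full(n_genes, 1.0)
rates_a[:10] = 20.0
rates_b = np.full(n_genes, 1.0)
rates_b[10:20] = 20.0
counts = np.vstack([
    rng.poisson(rates_a, size=(30, n_genes)),
    rng.poisson(rates_b, size=(30, n_genes)),
])

normalized = normalize_log(counts)            # remove sequencing-depth differences
scaled = scale_genes(normalized)              # put genes on a comparable scale
embedding = pca(scaled, n_components=2)       # two-dimensional summary for plotting
labels = kmeans(embedding, k=2)               # group cells by expression similarity
```

The two columns of `embedding` could be drawn as a scatterplot coloured by `labels`. Real analyses typically rely on dedicated packages and on additional steps, such as quality control and batch correction, that this sketch omits.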

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Data Analysis*
  • Data Visualization*
  • Datasets as Topic*
  • Electronic Data Processing
  • Gene Expression Profiling
  • High-Throughput Nucleotide Sequencing
  • Humans
  • RNA-Seq*
  • Sequence Analysis, RNA
  • Single-Cell Analysis*
  • Software
  • Workflow