Evaluation of tools for highly variable gene discovery from single-cell RNA-seq data

Brief Bioinform. 2019 Jul 19;20(4):1583-1589. doi: 10.1093/bib/bby011.

Abstract

Traditional RNA sequencing (RNA-seq) allows the detection of gene expression variations between two or more cell populations through differentially expressed gene (DEG) analysis. However, genes that contribute to cell-to-cell differences are not discoverable with traditional RNA-seq because its samples are obtained from a mixture of cells. Single-cell RNA-seq (scRNA-seq) allows the detection of gene expression in each cell. With scRNA-seq, highly variable gene (HVG) discovery allows the detection of genes that contribute strongly to cell-to-cell variation within a homogeneous cell population, such as a population of embryonic stem cells. This analysis is implemented in many software packages. In this study, we compare seven HVG methods from six software packages: BASiCS, Brennecke, scLVM, scran, scVEGs and Seurat. Our results demonstrate that reproducibility in HVG analysis requires a larger sample size than DEG analysis. We discuss discrepancies between methods and potential issues in these tools, and make recommendations.
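To make the idea of HVG discovery concrete, the following is a minimal sketch of one generic approach: compute each gene's mean and squared coefficient of variation (CV²) across cells, fit a trend of CV² against the mean, and rank genes by how far they sit above that trend. It is not the implementation of any of the six packages compared in the study. The function names `simulate_counts` and `rank_hvgs`, the log-log linear trend and the simulated data are all assumptions made for illustration.

```python
import numpy as np


def simulate_counts(n_genes=200, n_cells=500, n_hvg=5, seed=0):
    """Hypothetical toy data: Poisson counts, genes x cells.

    The first `n_hvg` genes get an extra two-state, cell-specific rate
    factor, so they vary more between cells than Poisson noise predicts.
    """
    rng = np.random.default_rng(seed)
    gene_means = rng.uniform(5.0, 100.0, size=n_genes)
    rates = np.tile(gene_means[:, None], (1, n_cells))
    for g in range(n_hvg):
        # Each cell is in a "low" (x0.1) or "high" (x1.9) expression state.
        factor = np.where(rng.random(n_cells) < 0.5, 0.1, 1.9)
        rates[g] *= factor
    return rng.poisson(rates)


def rank_hvgs(counts, n_top=10):
    """Rank genes by excess variability over a fitted mean-CV^2 trend.

    Assumption: a straight line in log-log space is an adequate trend.
    Real tools use more careful noise models and statistical tests.
    """
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean(axis=1)
    var = counts.var(axis=1, ddof=1)

    # Genes with zero mean or zero variance carry no usable signal here.
    keep = (mean > 0) & (var > 0)
    log_mean = np.log(mean[keep])
    log_cv2 = np.log(var[keep] / mean[keep] ** 2)

    # Fit log(CV^2) = slope * log(mean) + intercept across all kept genes.
    slope, intercept = np.polyfit(log_mean, log_cv2, 1)

    # Residual above the trend = variability not explained by the mean.
    residual = np.full(mean.shape, -np.inf)
    residual[keep] = log_cv2 - (slope * log_mean + intercept)

    order = np.argsort(-residual)
    return order[:n_top], residual


if __name__ == "__main__":
    counts = simulate_counts()
    top, residual = rank_hvgs(counts, n_top=5)
    print("Top-ranked gene indices:", sorted(top.tolist()))
```

The residual ranking deliberately stops short of a significance test. The published methods differ in exactly this part (how the technical noise trend is modelled and how genes are tested against it), which is one source of the discrepancies the abstract refers to.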

Keywords: DEG analysis; highly variable gene; scRNA-seq; single-cell RNA seq; software.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Review

MeSH terms

  • Animals
  • Cluster Analysis
  • Computational Biology
  • Computer Simulation
  • Databases, Nucleic Acid / statistics & numerical data
  • Gene Expression Profiling / statistics & numerical data
  • Genetic Variation
  • Humans
  • RNA-Seq / statistics & numerical data*
  • Reproducibility of Results
  • Single-Cell Analysis / statistics & numerical data
  • Software*