Conditional estimation of local pooled dispersion parameter in small-sample RNA-Seq data improves differential expression test

J Bioinform Comput Biol. 2016 Oct;14(5):1644006. doi: 10.1142/S0219720016440066. Epub 2016 Sep 15.

Abstract

High-throughput sequencing technology in transcriptomics studies contributes to the understanding of gene regulation mechanisms and their cellular function, but it also increases the need for accurate statistical methods to assess quantitative differences between experiments. Many methods have been developed to account for the specifics of count data: non-normality, dependence of the variance on the mean, and small sample size. Among these, the small number of samples in typical experiments remains a challenge. Here we present a method for differential analysis of count data that uses conditional estimation of local pooled dispersion parameters. A comprehensive evaluation of the proposed method for differential gene expression analysis, using both simulated and real data sets, shows that it is more powerful than existing methods while controlling the false discovery rate. By introducing conditional estimation of local pooled dispersion parameters, we overcome the limitation of low power and enable powerful differential expression testing with a small number of samples.
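The abstract gives no algorithmic detail, so the following is only an illustrative sketch of the general idea of pooling dispersion estimates locally; it is not the authors' conditional estimator. It assumes a negative binomial mean-variance relationship, Var = mu + phi * mu^2, a method-of-moments estimate of phi per gene, and a hypothetical pooling rule that takes the median dispersion over genes with neighbouring mean expression. The function names and the `window` parameter are assumptions made for illustration.

```python
from statistics import mean, median, variance


def mom_dispersion(counts):
    """Method-of-moments dispersion under Var = mu + phi * mu^2.

    Returns 0.0 when the sample variance does not exceed the mean.
    Requires at least two replicates.
    """
    m = mean(counts)
    if m <= 0:
        return 0.0
    v = variance(counts)  # unbiased sample variance
    return max((v - m) / (m * m), 0.0)


def local_pooled_dispersion(count_matrix, window=2):
    """Pool per-gene dispersions across genes with similar mean expression.

    count_matrix: list of per-gene replicate counts (one row per gene).
    window: hypothetical number of neighbours taken on each side, by mean rank.
    """
    means = [mean(row) for row in count_matrix]
    disp = [mom_dispersion(row) for row in count_matrix]
    # Rank genes by mean expression so that "local" means "similar mean".
    order = sorted(range(len(count_matrix)), key=lambda g: means[g])
    pooled = [0.0] * len(count_matrix)
    for rank, gene in enumerate(order):
        lo = max(0, rank - window)
        hi = min(len(order), rank + window + 1)
        pooled[gene] = median(disp[i] for i in order[lo:hi])
    return pooled
```

With only a few replicates per gene, the per-gene estimate is noisy; borrowing information from genes of similar expression level is the motivation for pooling in this sketch. The median is one arbitrary choice of pooling statistic among several possible ones.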

Keywords: Differential expression test; RNA-Seq analysis; local pooled dispersion estimation; small sample data.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Computer Simulation
  • Data Interpretation, Statistical
  • Databases, Genetic
  • Female
  • Gene Expression Regulation
  • HapMap Project
  • Humans
  • Male
  • Models, Genetic
  • Sample Size
  • Sequence Analysis, RNA / methods*
  • Sequence Analysis, RNA / statistics & numerical data