A pipeline for RNA-seq data processing and quality assessment

Bioinformatics. 2011 Mar 15;27(6):867-9. doi: 10.1093/bioinformatics/btr012. Epub 2011 Jan 13.

Abstract

Summary: We present an R-based pipeline, ArrayExpressHTS, for pre-processing, expression estimation and data quality assessment of high-throughput sequencing transcriptional profiling (RNA-seq) datasets. The pipeline starts from raw sequence files and produces standard Bioconductor R objects containing gene or transcript measurements for downstream analysis, along with web reports for data quality assessment. It may be run locally on a user's own computer or remotely on a distributed R-cloud farm at the European Bioinformatics Institute. It can be used to analyse a user's own datasets or public RNA-seq datasets from the ArrayExpress Archive.

Availability: The R package is available at www.ebi.ac.uk/tools/rcloud with online documentation at www.ebi.ac.uk/Tools/rwiki/, also available as supplementary material.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Computational Biology / methods*
  • Databases, Genetic
  • Gene Expression Profiling / methods
  • Internet
  • RNA / analysis
  • Sequence Alignment
  • Sequence Analysis, RNA / methods*
  • Software*

Substances

  • RNA