Consistent errors in first strand cDNA due to random hexamer mispriming

PLoS One. 2013 Dec 30;8(12):e85583. doi: 10.1371/journal.pone.0085583. eCollection 2013.

Abstract

Priming of random hexamers in cDNA synthesis is known to show sequence bias, but in addition it has been suggested recently that mismatches in random hexamer priming could be a cause of mismatches between the original RNA fragment and observed sequence reads. To explore random hexamer mispriming as a potential source of these errors, we analyzed two independently generated RNA-seq datasets of synthetic ERCC spikes for which the reference is known. First strand cDNA synthesized by random hexamer priming on RNA showed consistent position- and nucleotide-specific mismatch errors in the first seven nucleotides. The mismatch errors found in both datasets are consistent in distribution, and thermodynamically stable mismatches are more common. This strongly indicates that RNA-DNA mispriming of specific random hexamers causes these errors. Due to their consistency and specificity, mispriming errors can have profound implications for downstream applications if not dealt with properly.
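
The position-specific mismatch analysis described in the abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `mismatch_profile`, its parameters, and the toy sequences are hypothetical, and the sketch assumes each read is already aligned without gaps to its reference segment, with the 5' end of the read first.

```python
from collections import Counter


def mismatch_profile(pairs, n_positions=7):
    """Tally mismatches over the first n_positions bases of each read.

    pairs: iterable of (read, reference) strings, ungapped and aligned,
           with the 5' end of the read at index 0 (an assumption of this sketch).
    Returns (rates, substitutions), where rates[i] is the mismatch fraction
    at read position i + 1 and substitutions counts
    (position, reference_base, read_base) triples.
    """
    totals = [0] * n_positions
    mismatches = [0] * n_positions
    substitutions = Counter()
    for read, ref in pairs:
        # zip stops at the shorter string, so short reads are handled safely.
        for i, (read_base, ref_base) in enumerate(
            zip(read[:n_positions], ref[:n_positions])
        ):
            totals[i] += 1
            if read_base != ref_base:
                mismatches[i] += 1
                substitutions[(i + 1, ref_base, read_base)] += 1
    rates = [m / t if t else 0.0 for m, t in zip(mismatches, totals)]
    return rates, substitutions


# Toy data (invented for illustration, not taken from the study).
toy_pairs = [
    ("ACGTACGTAA", "ACGTACGTAA"),
    ("TCGTACGTAA", "ACGTACGTAA"),
    ("TCGAACGTAA", "ACGTACGTAA"),
    ("ACGTACGTAA", "ACGTACGTAA"),
]
rates, substitutions = mismatch_profile(toy_pairs)
print(rates)
print(substitutions.most_common())
```

Comparing such per-position rates and substitution counts between two independent datasets is one simple way to check whether mismatches recur at the same positions with the same nucleotides.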

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • DNA Primers / chemistry*
  • DNA, Complementary / biosynthesis*
  • DNA, Complementary / chemistry
  • Datasets as Topic
  • RNA, Plant / chemistry*
  • RNA, Plant / genetics
  • Reverse Transcription*
  • Taraxacum / chemistry*
  • Taraxacum / genetics

Substances

  • DNA Primers
  • DNA, Complementary
  • RNA, Plant

Associated data

  • GEO/GSM517062
  • SRA/SRR954526

Grants and funding

This study was funded by NWO Netherlands (Netherlands Organization for Scientific Research) (www.nwo.nl), grant number NWO-ALW 820.01.025. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.