The "grep" command but not FusionMap, FusionFinder or ChimeraScan captures the CIC-DUX4 fusion gene from whole transcriptome sequencing data on a small round cell tumor with t(4;19)(q35;q13)

PLoS One. 2014 Jun 20;9(6):e99439. doi: 10.1371/journal.pone.0099439. eCollection 2014.

Abstract

Whole transcriptome sequencing was used to study a small round cell tumor in which a t(4;19)(q35;q13) was part of the complex karyotype but in which the initial reverse transcriptase PCR (RT-PCR) examination did not detect a CIC-DUX4 fusion transcript, previously described as the crucial gene-level outcome of this specific translocation. The RNA sequencing data were analysed using the FusionMap, FusionFinder, and ChimeraScan programs, which are specifically designed to identify fusion genes. FusionMap, FusionFinder, and ChimeraScan identified 1017, 102, and 101 fusion transcripts, respectively, but CIC-DUX4 was not among them. Since the RNA sequencing data are in the text-based fastq format, we searched the files using the "grep" command-line utility. The "grep" command searches text for a specified expression and displays, by default, the lines where matches occur. The expression used was a sequence of 20 nucleotides from the coding part of exon 20, the last exon of CIC (Reference Sequence: NM_015125.3), chosen because all CIC breakpoints reported so far have occurred there. Fifteen chimeric CIC-DUX4 cDNA sequences were captured, and the fusion between the CIC and DUX4 genes was mapped precisely. New primer combinations were constructed based on these findings and were used, together with a polymerase suitable for amplification of GC-rich DNA templates, to amplify CIC-DUX4 cDNA fragments, which had the same fusion point as that found with "grep". In conclusion, FusionMap, FusionFinder, and ChimeraScan generated a plethora of fusion transcripts but did not detect the biologically important CIC-DUX4 chimeric transcript; they are generally useful but evidently suffer from imperfect sensitivity and specificity. The "grep" command is an excellent tool to capture chimeric transcripts from RNA sequencing data when the pathological and/or cytogenetic information strongly indicates the presence of a specific fusion gene.
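
The grep-style search described in the abstract can be sketched in Python as a plain substring scan over the sequence lines of a fastq file. This is a minimal illustration and not the authors' procedure: the function names `reverse_complement`, `reads_containing`, and `open_fastq` are hypothetical, the probe passed in would be the 20-nucleotide CIC exon 20 sequence (not reproduced here), and the additional check for the reverse complement is an assumption that the abstract does not mention. The sketch also assumes the common four-lines-per-read fastq layout with unwrapped sequences.

```python
import gzip
from typing import Iterator, TextIO, Tuple

# Translation table for complementing nucleotides (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a nucleotide string."""
    return seq.translate(_COMPLEMENT)[::-1]


def open_fastq(path: str) -> TextIO:
    """Open a plain or gzip-compressed fastq file in text mode."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


def reads_containing(fastq: TextIO, probe: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) for reads whose sequence contains the probe.

    Like grep, this is an exact substring match with no alignment.
    Unlike plain grep, only the sequence line of each four-line record
    is inspected, and the reverse complement of the probe is also tried
    (an assumption added for this sketch).
    """
    probe = probe.upper()
    probe_rc = reverse_complement(probe)
    header = ""
    for index, line in enumerate(fastq):
        position = index % 4  # 0: header, 1: sequence, 2: '+', 3: quality
        if position == 0:
            header = line.rstrip("\n")
        elif position == 1:
            sequence = line.strip()
            upper = sequence.upper()
            if probe in upper or probe_rc in upper:
                yield header, sequence
```

Each read returned this way contains the probe and may extend across the junction into the partner gene, so inspecting the sequence flanking the probe is how a fusion point could then be located by eye or by alignment. Because the match is exact, reads with a sequencing error inside the 20-nucleotide window would be missed.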

Publication types

  • Case Reports
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Adult
  • Female
  • Humans
  • In Situ Hybridization, Fluorescence
  • Karyotype
  • Oncogene Proteins, Fusion / biosynthesis
  • Oncogene Proteins, Fusion / genetics*
  • Sequence Analysis, RNA*
  • Soft Tissue Neoplasms / genetics*
  • Soft Tissue Neoplasms / pathology
  • Transcriptome / genetics
  • Translocation, Genetic / genetics*

Substances

  • CIC-DUX4 fusion protein, human
  • Oncogene Proteins, Fusion

Grants and funding

This work was supported by grants from the Norwegian Cancer Society. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.