A novel bioinformatics pipeline for identification and characterization of fusion transcripts in breast cancer and normal cell lines

Nucleic Acids Res. 2011 Aug;39(15):e100. doi: 10.1093/nar/gkr362. Epub 2011 May 27.

Abstract

SnowShoes-FTD, developed for fusion transcript detection in paired-end mRNA-Seq data, employs multiple steps of false positive filtering to nominate fusion transcripts with near 100% confidence. Unique features include: (i) identification of multiple fusion isoforms from two gene partners; (ii) prediction of genomic rearrangements; (iii) identification of exon fusion boundaries; (iv) generation of a 5'-3' fusion spanning sequence for PCR validation; and (v) prediction of the protein sequences, including frameshifts and amino acid insertions. We applied SnowShoes-FTD to identify 50 fusion candidates in 22 breast cancer and 9 non-transformed cell lines. Five additional fusion candidates with two isoforms were confirmed. In all, 30 of 55 fusion candidates had in-frame protein products. No fusion transcripts were detected in non-transformed cells. Consideration of the possible functions of a subset of predicted fusion proteins suggests several potentially important functions in transformation, including a possible new mechanism for overexpression of ERBB2 in a HER2-positive cell line. The source code of SnowShoes-FTD is provided in two formats: one configured to run on the Sun Grid Engine for parallelization, and the other formatted to run on a single Linux node. Perl executables are available for download from our web site: http://mayoresearch.mayo.edu/mayo/research/biostat/stand-alone-packages.cfm.
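As a rough illustration of two ideas named in the abstract (nominating candidates from paired-end reads, and checking whether a fusion yields an in-frame product), the following Python sketch may help. It is not the SnowShoes-FTD implementation, which is distributed in Perl. The names `nominate_candidates`, `FusionJunction`, and `is_in_frame`, the gene labels, and the `min_supporting_pairs` threshold are all hypothetical, and the sketch assumes each read pair has already been annotated with the gene each mate maps to.

```python
from collections import Counter
from dataclasses import dataclass


def nominate_candidates(read_pairs, min_supporting_pairs=2):
    """Count read pairs whose mates map to two different genes.

    read_pairs: iterable of (gene_of_mate1, gene_of_mate2) tuples (assumed input).
    min_supporting_pairs: hypothetical threshold, not taken from the paper.
    Returns a dict mapping an unordered gene pair to its supporting-pair count.
    """
    counts = Counter()
    for gene_a, gene_b in read_pairs:
        if gene_a != gene_b:
            # Sort so that (A, B) and (B, A) support the same candidate.
            counts[tuple(sorted((gene_a, gene_b)))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_supporting_pairs}


@dataclass(frozen=True)
class FusionJunction:
    """Hypothetical record describing one exon-exon fusion boundary."""
    five_prime_gene: str
    three_prime_gene: str
    five_prime_cds_bases: int    # coding bases contributed by the 5' partner
    three_prime_cds_offset: int  # 0-based CDS offset where the 3' partner resumes


def is_in_frame(junction):
    """Return True when the 3' partner resumes in the codon phase the 5' partner left off."""
    return junction.five_prime_cds_bases % 3 == junction.three_prime_cds_offset % 3


if __name__ == "__main__":
    pairs = [
        ("GENE_A", "GENE_B"),
        ("GENE_B", "GENE_A"),
        ("GENE_A", "GENE_A"),  # concordant pair, ignored
        ("GENE_C", "GENE_D"),  # only one supporting pair, filtered out
    ]
    print(nominate_candidates(pairs))
    print(is_in_frame(FusionJunction("GENE_A", "GENE_B", 300, 150)))
```

A real pipeline would add further false positive filters before a candidate is reported; this sketch only shows the counting and reading-frame arithmetic.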

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms / genetics*
  • Breast Neoplasms / metabolism
  • Cell Line
  • Cell Line, Tumor
  • Computational Biology / methods
  • Female
  • Gene Fusion*
  • Humans
  • Mutant Chimeric Proteins / genetics*
  • Mutant Chimeric Proteins / metabolism
  • Mutation
  • Promoter Regions, Genetic
  • RNA, Messenger / analysis
  • RNA, Messenger / chemistry*
  • Receptor, ErbB-2 / genetics
  • Receptor, ErbB-2 / metabolism
  • Sequence Alignment
  • Sequence Analysis, RNA
  • Software*

Substances

  • Mutant Chimeric Proteins
  • RNA, Messenger
  • Receptor, ErbB-2