Socrates: identification of genomic rearrangements in tumour genomes by re-aligning soft clipped reads

Bioinformatics. 2014 Apr 15;30(8):1064-1072. doi: 10.1093/bioinformatics/btt767. Epub 2014 Jan 2.

Abstract

Motivation: Methods for detecting somatic genome rearrangements in tumours using next-generation sequencing are vital in cancer genomics. Available algorithms use one or more sources of evidence, such as read depth, paired-end reads or split reads, to predict structural variants. However, the problem remains challenging due to the significant computational burden and high false-positive or false-negative rates.

Results: In this article, we present Socrates (SOft Clip re-alignment To idEntify Structural variants), a highly efficient and effective method for detecting genomic rearrangements in tumours that uses only split-read data. Socrates has single-nucleotide resolution, identifies micro-homologies and untemplated sequence at breakpoints, has high sensitivity and high specificity, and takes advantage of parallelism for efficient use of resources. We demonstrate using simulated and real data that Socrates performs well compared with a number of existing structural variant detection tools.
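
To make the idea of soft-clipped split-read evidence concrete, the following minimal Python sketch extracts the clipped portion of a read and the adjacent reference position from a SAM-style alignment (POS, CIGAR, SEQ). It is an illustration only and is not taken from the Socrates source code: the function name `soft_clips`, the returned tuple layout and the choice to report the aligned base next to the clip as the candidate breakpoint are all assumptions made for this example. In a soft-clip re-alignment approach, the clipped sequences returned here are what would then be aligned back to the reference to locate the other side of a rearrangement.

```python
import re

# One CIGAR operation: a length followed by an operation code (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(pos, cigar, seq):
    """Return a list of (side, breakpoint, clipped_sequence) tuples.

    pos   -- 1-based leftmost aligned reference position (SAM POS field)
    cigar -- CIGAR string, e.g. "5S20M"
    seq   -- read sequence as stored in the SAM SEQ field

    Hypothetical helper for illustration; not part of Socrates.
    """
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard-clipped bases are absent from SEQ, so ignore them here.
    ops = [(n, op) for n, op in ops if op != "H"]

    clips = []
    # Soft clip at the start of the read: the candidate breakpoint is
    # the first aligned reference base.
    if ops and ops[0][1] == "S":
        n = ops[0][0]
        clips.append(("left", pos, seq[:n]))
    # Soft clip at the end of the read: the candidate breakpoint is
    # the last aligned reference base.
    if len(ops) > 1 and ops[-1][1] == "S":
        n = ops[-1][0]
        # Operations that consume reference bases: M, D, N, = and X.
        ref_len = sum(k for k, op in ops if op in "MDN=X")
        clips.append(("right", pos + ref_len - 1, seq[-n:]))
    return clips
```

For example, `soft_clips(100, "20M5S", "C" * 20 + "G" * 5)` reports a right-hand clip whose last aligned base is at reference position 119, with `GGGGG` as the sequence to re-align.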

Availability and implementation: Socrates is released as open source and available from http://bioinf.wehi.edu.au/socrates

Contact: papenfuss@wehi.edu.au

Supplementary information: Supplementary data are available at Bioinformatics online.

MeSH terms

  • Algorithms
  • Computational Biology
  • Genomics / methods
  • High-Throughput Nucleotide Sequencing / methods
  • Humans
  • Neoplasms / genetics*
  • Sequence Analysis, DNA / methods
  • Software*