Identifying in-trans process associated genes in breast cancer by integrated analysis of copy number and expression data

PLoS One. 2013;8(1):e53014. doi: 10.1371/journal.pone.0053014. Epub 2013 Jan 30.

Abstract

Genomic copy number alterations are common in cancer. Finding the genes causally implicated in oncogenesis is challenging because the gain or loss of a chromosomal region may affect a few key driver genes and many passengers. Integrative analyses have opened new vistas for addressing this issue. One approach is to identify genes with frequent copy number alterations and corresponding changes in expression. Several methods also analyse effects of transcriptional changes on known pathways. Here, we propose a method that analyses in-cis correlated genes for evidence of in-trans association to biological processes, with no bias towards processes of a particular type or function. The method aims to identify cis-regulated genes for which the expression correlation to other genes provides further evidence of a network-perturbing role in cancer. The proposed unsupervised approach involves a sequence of statistical tests to systematically narrow down the list of relevant genes, based on integrative analysis of copy number and gene expression data. A novel adjustment method handles confounding effects of co-occurring copy number aberrations, potentially a large source of false positives in such studies. When the method was applied to whole-genome copy number and expression data from 100 primary breast carcinomas, 6373 genes were identified as commonly aberrant, 578 were highly in-cis correlated, and 56 were additionally associated in-trans to biological processes. Among these in-trans process associated and cis-correlated (iPAC) genes, 28% have previously been reported as breast cancer associated, and 64% as cancer associated. By combining statistical evidence from three separate subanalyses that focus, respectively, on copy number, gene expression and the combination of the two, the proposed method identifies several known and novel cancer driver candidates. Validation in an independent data set supports the conclusion that the method identifies genes implicated in cancer.
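
The three-stage narrowing described in the abstract (commonly aberrant, then in-cis correlated, then in-trans associated) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not name the statistical tests, so the gain/loss cut-offs, the use of Pearson correlation, the "count of correlated partner genes" stand-in for process association, and all function names and thresholds below are assumptions chosen for illustration only. The adjustment for co-occurring copy number aberrations is not modelled.

```python
import numpy as np


def unit_rows(x):
    """Center each row and scale it to unit length; constant rows become all zeros."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean(axis=1, keepdims=True)
    norm = np.sqrt((centered ** 2).sum(axis=1, keepdims=True))
    return np.divide(centered, norm, out=np.zeros_like(centered), where=norm > 0)


def aberration_frequency(copy_number, gain=0.3, loss=-0.3):
    """Fraction of samples in which each gene is gained or lost.

    The cut-offs are hypothetical values, not taken from the paper.
    """
    cn = np.asarray(copy_number, dtype=float)
    return ((cn > gain) | (cn < loss)).mean(axis=1)


def cis_correlation(copy_number, expression):
    """Pearson correlation between each gene's copy number and its own expression."""
    return (unit_rows(copy_number) * unit_rows(expression)).sum(axis=1)


def trans_partner_counts(expression, candidates, min_abs_r=0.8):
    """Count, per candidate, the other genes whose expression correlates strongly with it.

    This is a crude stand-in for the paper's in-trans process association step.
    """
    u = unit_rows(expression)
    corr = u[candidates] @ u.T                                 # candidates x genes
    corr[np.arange(len(candidates)), candidates] = 0.0         # ignore self-correlation
    return (np.abs(corr) >= min_abs_r).sum(axis=1)


def filter_genes(copy_number, expression, min_freq=0.2, min_cis_r=0.6, min_partners=1):
    """Apply the three filters in sequence; inputs have shape (genes, samples)."""
    freq = aberration_frequency(copy_number)
    cis_r = cis_correlation(copy_number, expression)
    candidates = np.flatnonzero((freq >= min_freq) & (cis_r >= min_cis_r))
    partners = trans_partner_counts(expression, candidates)
    return candidates[partners >= min_partners]
```

Calling `filter_genes(copy_number, expression)` on two matrices with genes as rows and samples as columns returns the row indices that pass all three filters. In a real analysis each threshold would be replaced by a formal test with multiple-testing control.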

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Breast Neoplasms / genetics*
  • Breast Neoplasms / pathology
  • Chromosome Aberrations
  • Comparative Genomic Hybridization
  • DNA Copy Number Variations / genetics*
  • Female
  • Gene Dosage*
  • Gene Expression Regulation, Neoplastic
  • Genome, Human*
  • Humans
  • Oligonucleotide Array Sequence Analysis

Grants and funding

(1) Norwegian Research Council (Grant no 193387/V50, Understanding breast cancer genomics); (2) Norwegian Cancer Society (Grant no 138296 – PR-2008-0108, Exploring The Systems Biology of Breast Cancer); (3) South-Eastern Norway Regional Health Authority (Project no 2011042, OSBREAC: Towards personalized therapy for breast cancer); (4) K.G. Jebsen Center for Breast Cancer Research. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.