Support Vector Machine classifier for estrogen receptor positive and negative early-onset breast cancer

PLoS One. 2013 Jul 19;8(7):e68606. doi: 10.1371/journal.pone.0068606. Print 2013.

Abstract

Two major breast cancer sub-types are defined by the expression of estrogen receptors on tumour cells. Cancers with large numbers of receptors are termed estrogen receptor positive, and those with few are termed estrogen receptor negative. Using genome-wide single nucleotide polymorphism (SNP) genotype data for a sample of early-onset breast cancer patients, we developed a Support Vector Machine (SVM) classifier from 200 germline variants associated with estrogen receptor status (p<0.0005). With a linear-kernel SVM, we achieved classification accuracy exceeding 93%. The model indicates that polygenic variation in more than 100 genes is likely to underlie the estrogen receptor phenotype in early-onset breast cancer. Functional classification of the genes involved identifies enrichment of functions linked to the immune system, which is consistent with the current understanding of the biological role of estrogen receptors in breast cancer.
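The workflow summarised in the abstract (select variants associated with receptor status, then fit a linear-kernel SVM on their genotypes) can be illustrated with a minimal sketch. This is not the authors' pipeline. The genotypes are simulated, the allele frequencies and sample sizes are invented for illustration, a simple difference in mean allele count stands in for the association test behind the p<0.0005 threshold, and the SVM is fitted by plain stochastic subgradient descent on the hinge loss instead of a dedicated solver. All function names are hypothetical.

```python
import random

def simulate_genotypes(n_samples, n_snps, n_causal, rng):
    """Return (X, y): allele counts 0/1/2 per SNP and labels +1 (ER+) / -1 (ER-)."""
    X, y = [], []
    for i in range(n_samples):
        label = 1 if i % 2 == 0 else -1
        row = []
        for j in range(n_snps):
            # Assumed effect: the first n_causal SNPs differ in allele frequency by class.
            freq = (0.5 if label == 1 else 0.1) if j < n_causal else 0.3
            row.append((rng.random() < freq) + (rng.random() < freq))
        X.append(row)
        y.append(label)
    return X, y

def select_snps(X, y, k):
    """Keep the k SNPs with the largest class difference in mean allele count.
    This score is a stand-in for a per-SNP association test."""
    pos = [row for row, label in zip(X, y) if label == 1]
    neg = [row for row, label in zip(X, y) if label == -1]
    def score(j):
        return abs(sum(r[j] for r in pos) / len(pos) - sum(r[j] for r in neg) / len(neg))
    return sorted(range(len(X[0])), key=score, reverse=True)[:k]

def train_linear_svm(X, y, lam=0.01, lr=0.01, epochs=30, rng=None):
    """Minimise the L2-regularised hinge loss by stochastic subgradient descent."""
    rng = rng or random.Random(0)
    w, b = [0.0] * len(X[0]), 0.0
    order = list(range(len(X)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * xj for wj, xj in zip(w, X[i])) + b)
            w = [wj * (1 - lr * lam) for wj in w]  # shrinkage from the L2 penalty
            if margin < 1:  # sample is misclassified or inside the margin
                w = [wj + lr * y[i] * xj for wj, xj in zip(w, X[i])]
                b += lr * y[i]
    return w, b

def predict(w, b, row):
    """Classify one genotype row by the sign of the linear decision function."""
    return 1 if sum(wj * xj for wj, xj in zip(w, row)) + b >= 0 else -1

rng = random.Random(42)
X, y = simulate_genotypes(n_samples=300, n_snps=200, n_causal=10, rng=rng)
X_train, y_train, X_test, y_test = X[:200], y[:200], X[200:], y[200:]

# Select SNPs on the training split only, so the held-out estimate is not
# inflated by information from the test samples.
selected = select_snps(X_train, y_train, k=10)

def subset(rows):
    return [[row[j] for j in selected] for row in rows]

w, b = train_linear_svm(subset(X_train), y_train, rng=rng)

correct = sum(predict(w, b, row) == label for row, label in zip(subset(X_test), y_test))
accuracy = correct / len(y_test)
print(f"selected SNP indices: {sorted(selected)}")
print(f"held-out accuracy: {accuracy:.2f}")
```

Selecting variants inside the training split is a design choice of this sketch. The abstract does not say how the reported accuracy was validated, so the numbers printed here are not comparable with the 93% figure.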

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Age of Onset
  • Biomarkers, Tumor / genetics
  • Breast Neoplasms / diagnosis*
  • Breast Neoplasms / genetics*
  • Female
  • Gene Expression Profiling
  • Humans
  • Molecular Sequence Annotation
  • Polymorphism, Single Nucleotide
  • ROC Curve
  • Receptors, Estrogen / genetics*
  • Receptors, Estrogen / metabolism
  • Support Vector Machine*

Substances

  • Biomarkers, Tumor
  • Receptors, Estrogen