Coding SNPs as intrinsic markers for sample tracking in large-scale transcriptome studies

Biotechniques. 2012 Jun;52(6):386-8. doi: 10.2144/0000113879.

Abstract

Large-scale transcriptome profiling in clinical studies often involves assaying multiple samples from a patient to monitor disease progression, treatment effect, and host response in multiple tissues. Such profiling is prone to human error, which often results in mislabeled samples. Here, we present a method to detect mislabeled sample outliers using coding single nucleotide polymorphisms (cSNPs) specifically designed into the microarray, and demonstrate that mislabeled samples can be efficiently identified either by simple clustering of allele-specific expression scores or by a Mahalanobis distance-based outlier detection method. Based on our results, we recommend incorporating cSNPs into future transcriptome array designs as intrinsic markers for sample tracking.
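The abstract does not give implementation details, so the following is only a minimal sketch of how a Mahalanobis distance-based check for mislabeled samples could look, not the authors' procedure. It assumes each sample is summarized as a vector of allele-specific expression scores (one per cSNP), measures each sample's distance to the per-cSNP median of the samples sharing its recorded patient label, and uses a pooled, ridge-regularized covariance. The function name, the `ridge` parameter, the median centroid, and the simulated scores are all hypothetical choices made for illustration.

```python
import numpy as np


def mahalanobis_to_labeled_centroid(scores, labels, ridge=0.1):
    """Mahalanobis distance of each sample to its recorded patient's centroid.

    scores: (n_samples, n_snps) allele-specific expression scores (assumed input).
    labels: recorded patient label for each sample.
    ridge:  hypothetical regularization added to the covariance diagonal so the
            matrix stays invertible when samples are few.
    """
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)

    # Residual of each sample from the per-cSNP median of its labeled group.
    # The median is an assumption here: it is less affected than the mean
    # by a single mislabeled sample in the group.
    residuals = np.empty_like(scores)
    for patient in np.unique(labels):
        mask = labels == patient
        residuals[mask] = scores[mask] - np.median(scores[mask], axis=0)

    # Pooled within-group covariance with ridge regularization.
    n_samples, n_snps = scores.shape
    cov = residuals.T @ residuals / max(n_samples - 1, 1)
    cov += ridge * np.eye(n_snps)
    inv_cov = np.linalg.inv(cov)

    # d_i = sqrt(r_i^T * inv_cov * r_i) for every sample i.
    squared = np.einsum("ij,jk,ik->i", residuals, inv_cov, residuals)
    return np.sqrt(squared)


# Simulated example (made-up scores): 3 patients, 4 samples each, 5 cSNPs.
rng = np.random.default_rng(0)
genotype_scores = {
    "A": [1, -1, 0, 1, -1],
    "B": [-1, 1, 1, 0, 0],
    "C": [0, 0, -1, -1, 1],
}
true_patient = ["A"] * 4 + ["B"] * 4 + ["C"] * 4
scores = np.array([genotype_scores[p] for p in true_patient], dtype=float)
scores += rng.normal(scale=0.05, size=scores.shape)

# Sample 5 truly comes from patient B but is recorded as patient A.
labels = list(true_patient)
labels[5] = "A"

distances = mahalanobis_to_labeled_centroid(scores, labels)
suspect = int(np.argmax(distances))  # index of the most outlying sample
```

In this sketch the sample with the largest distance is the candidate for relabeling. In practice a cutoff on the distances, and a more robust covariance estimate, would need to be chosen and validated on real data.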

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Base Sequence
  • Cluster Analysis
  • Gene Expression Profiling / methods*
  • Genetic Markers*
  • Humans
  • Leukocytes / physiology
  • Molecular Sequence Data
  • Oligonucleotide Array Sequence Analysis / methods*
  • Polymorphism, Single Nucleotide*
  • Transcriptome*

Substances

  • Genetic Markers