Towards optimally multiplexed applications of universal arrays

J Comput Biol. 2004;11(2-3):476-92. doi: 10.1089/1066527041410373.

Abstract

We study a design and optimization problem that occurs, for example, when single nucleotide polymorphisms (SNPs) are to be genotyped using a universal DNA tag array. The problem of optimizing the universal array to avoid disruptive cross-hybridization between universal components of the system was addressed in previous work. Cross-hybridization can, however, also occur assay specifically, due to unwanted complementarity involving assay-specific components. Here we examine the problem of identifying the most economic experimental configuration of the assay-specific components that avoids cross-hybridization. Our formalization translates this problem into the problem of covering the vertices of one side of a bipartite graph by a minimum number of balanced subgraphs of maximum degree 1. We show that the general problem is NP-complete. However, in the real biological setting, the vertices that need to be covered have degrees bounded by d. We exploit this restriction and develop an O(d)-approximation algorithm for the problem. We also give an O(d)-approximation for a variant of the problem in which the covering subgraphs are required to be vertex disjoint. In addition, we propose a stochastic model for the input data and use it to prove a lower bound on the cover size. We complement our theoretical analysis by implementing two heuristic approaches and testing their performance on synthetic data as well as on simulated SNP data.

Publication types

  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.

MeSH terms

  • Algorithms
  • Computational Biology / methods*
  • Computational Biology / statistics & numerical data
  • Genotype
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data*
  • Polymerase Chain Reaction
  • Polymorphism, Single Nucleotide