The incomplete perfect phylogeny haplotype problem

J Bioinform Comput Biol. 2005 Apr;3(2):359-84. doi: 10.1142/s0219720005001090.

Abstract

The problem of resolving genotypes into haplotypes under the perfect phylogeny model has been under intensive study recently. All studies so far have handled missing data entries heuristically. We prove that the perfect phylogeny haplotype problem is NP-complete when some of the data entries are missing, even when the phylogeny is rooted. We define a biologically motivated probabilistic model for genotype generation and for the way missing data occur. Under this model, we provide an algorithm that runs in expected polynomial time. In tests on simulated data, our algorithm quickly resolves the genotypes even under high rates of missing entries.
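
To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm described in the paper. It assumes a common encoding that the abstract does not specify (0 and 1 for homozygous sites, 2 for heterozygous sites, `None` for a missing entry) and assumes the root of the phylogeny is the all-zero haplotype, so that a set of haplotypes fits a rooted perfect phylogeny exactly when no pair of sites shows all three gametes 01, 10 and 11. All function names are hypothetical.

```python
from itertools import combinations, product

MISSING = None  # assumed marker for a missing genotype entry


def expansions(genotype):
    """Yield every haplotype pair (h1, h2) consistent with one genotype row.

    Assumed encoding: 0/1 = homozygous, 2 = heterozygous, None = missing.
    """
    per_site = []
    for g in genotype:
        if g == 0:
            per_site.append([(0, 0)])
        elif g == 1:
            per_site.append([(1, 1)])
        elif g == 2:
            per_site.append([(0, 1), (1, 0)])
        else:  # missing entry: any pair of alleles is allowed
            per_site.append([(0, 0), (1, 1), (0, 1), (1, 0)])
    for choice in product(*per_site):
        h1 = tuple(a for a, _ in choice)
        h2 = tuple(b for _, b in choice)
        yield h1, h2


def admits_rooted_perfect_phylogeny(haplotypes):
    """Three-gamete test, assuming the root is the all-zero haplotype."""
    if not haplotypes:
        return True
    n_sites = len(haplotypes[0])
    forbidden = {(0, 1), (1, 0), (1, 1)}
    for i, j in combinations(range(n_sites), 2):
        seen = {(h[i], h[j]) for h in haplotypes}
        if forbidden <= seen:
            return False
    return True


def resolve(genotypes):
    """Exhaustively search for a resolution; exponential in the worst case."""
    options = [list(expansions(g)) for g in genotypes]
    for combo in product(*options):
        haplotypes = [h for pair in combo for h in pair]
        if admits_rooted_perfect_phylogeny(haplotypes):
            return list(combo)
    return None


if __name__ == "__main__":
    print(resolve([(2, 2), (0, 1), (1, 0)]))
    print(resolve([(2, MISSING), (0, 1), (1, 0)]))
    print(resolve([(2, 2), (0, 1), (1, 0), (1, 1)]))
```

The search enumerates every completion of the heterozygous and missing entries, so its cost grows exponentially with their number. That is in keeping with the NP-completeness result stated above, and it is why this sketch is only suitable for tiny illustrative inputs.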

Publication types

  • Evaluation Study
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms*
  • Chromosome Mapping / methods*
  • DNA Mutational Analysis / methods*
  • Genetic Variation / genetics
  • Haplotypes / genetics*
  • Models, Genetic*
  • Models, Statistical
  • Phylogeny*
  • Polymorphism, Single Nucleotide / genetics*
  • Sequence Analysis, DNA / methods*
  • Sequence Homology, Nucleic Acid