Identification of genotype errors

Methods Mol Biol. 2012:850:11-24. doi: 10.1007/978-1-61779-555-8_2.

Abstract

Most large genotype datasets contain some errors, and an error rate of 1-2% is enough to distort map distances and to produce a false conclusion of linkage (Abecasis et al. Eur J Hum Genet 9(2):130-134, 2001). The data should therefore be as clean as possible, yet data cleaning is tedious and demands effort and experience. O'Connell and Weeks implemented four error-checking algorithms in a computer program called PedCheck. This chapter discusses those four algorithms, with a focus on the genotype-elimination method. It also provides an example of the four levels of error checking that PedCheck permits, together with the required input files, and briefly discusses alternative algorithms implemented in other statistical computing programs.
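To make the idea of genotype elimination concrete, the following is a minimal Python sketch of a single elimination pass over one nuclear family. It is an illustration under stated assumptions, not PedCheck's code: the function names `can_inherit` and `eliminate_pass`, the representation of a genotype as a sorted tuple of two allele labels, and the restriction to one family at one autosomal marker are all hypothetical choices made here. Each person starts with a set of candidate genotypes (a single genotype if typed, every possible genotype if untyped), and a candidate is kept only if it takes part in at least one parental pairing that is compatible with every child.

```python
from itertools import product


def can_inherit(child, father, mother):
    """True if the child genotype can be formed from one paternal and one maternal allele.

    Genotypes are sorted 2-tuples of allele labels (an assumption of this sketch).
    """
    return any(tuple(sorted((a, b))) == child for a in father for b in mother)


def eliminate_pass(father_set, mother_set, child_sets):
    """One genotype-elimination pass over a nuclear family (illustrative only).

    A parental genotype pair is saved only if every child has at least one
    candidate genotype compatible with it; the children's compatible
    candidates are saved alongside it. An empty result set for any person
    signals a Mendelian inconsistency in the family.
    """
    keep_f, keep_m = set(), set()
    keep_c = [set() for _ in child_sets]
    for gf, gm in product(father_set, mother_set):
        compatible = [
            {gc for gc in candidates if can_inherit(gc, gf, gm)}
            for candidates in child_sets
        ]
        if all(compatible):  # every child has at least one compatible genotype
            keep_f.add(gf)
            keep_m.add(gm)
            for kept, comp in zip(keep_c, compatible):
                kept |= comp
    return keep_f, keep_m, keep_c


# Untyped father (all genotypes over alleles 1 and 2), typed mother and children.
all_genotypes = {(1, 1), (1, 2), (2, 2)}
father, mother, children = eliminate_pass(
    all_genotypes, {(2, 2)}, [{(1, 2)}, {(2, 2)}]
)

# A child typed 1/1 with parents typed 1/1 and 2/2 is inconsistent.
bad_father, bad_mother, bad_children = eliminate_pass(
    {(1, 1)}, {(2, 2)}, [{(1, 1)}]
)
```

In the first call the father's candidates shrink to the single genotype 1/2, the only one that can produce both children; in the second call every set comes back empty, which flags the family as inconsistent. In a larger pedigree a pass like this would be repeated across nuclear families until no candidate set changes.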

MeSH terms

  • Algorithms*
  • Female
  • Genetic Techniques
  • Genotype*
  • Humans
  • Male
  • Pedigree