Identification of probable genotyping errors by consideration of haplotypes

Tim Becker; Ruta Valentonyte; Peter J P Croucher; Konstantin Strauch; Stefan Schreiber; Jochen Hampe; Michael Knapp

doi:10.1038/sj.ejhg.5201565

Eur J Hum Genet. 2006 Apr;14(4):450-8. doi: 10.1038/sj.ejhg.5201565.

Authors

Tim Becker¹, Ruta Valentonyte, Peter J P Croucher, Konstantin Strauch, Stefan Schreiber, Jochen Hampe, Michael Knapp

Affiliation

¹ Institute for Medical Biometry, Informatics and Epidemiology, University of Bonn, Sigmund-Freud-Strasse 25, D-53105 Bonn, Germany.

PMID: 16435001
DOI: 10.1038/sj.ejhg.5201565

Abstract

Undetected genotyping errors pose a problem in genetic epidemiological studies, as they may invalidate statistical analysis or reduce its power. Haplotype analysis places higher demands on data quality, because a haplotype can be inferred correctly only if the genotypes of all its markers are correct. Here, we present a method that identifies probable genotyping errors in trio samples with the help of the estimated haplotype frequency distribution of the sample. If the likelihood of the most likely haplotype explanation depends strongly on just one genotype, in the sense that setting that genotype to missing leads to a much more likely haplotype explanation, the genotype is flagged as a potential genotyping error. We describe a method that systematically searches the whole data set for such potential errors. Based on the haplotype distribution of a real data set, we carry out a simulation study to estimate the sensitivity and specificity of the method. In addition, we apply our approach to the real data set itself, and potentially erroneous genotypes are re-determined via sequencing. The results of both the simulation study and the application to the real data set show that a considerable proportion of true genotyping errors is detected and that the number of false-positive signals is acceptable. We conclude that it is indeed possible to identify probable genotyping errors by considering haplotypes. The method described here will be part of the next release of our FAMHAP software.
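The leave-one-genotype-out comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the FAMHAP implementation and it simplifies heavily: it scores a single unrelated individual rather than a trio, takes the haplotype frequencies as given rather than estimating them from the sample, and assumes Hardy-Weinberg proportions for haplotype pairs. The frequency table `HAP_FREQ`, the function names, and the likelihood-ratio `threshold` are all hypothetical choices made for illustration.

```python
from itertools import combinations_with_replacement

# Hypothetical haplotype frequencies over three biallelic markers (alleles 0/1).
# The paper estimates these from the trio sample; here they are made up.
HAP_FREQ = {
    (0, 0, 0): 0.60,
    (1, 1, 1): 0.38,
    (0, 1, 0): 0.01,
    (1, 0, 1): 0.01,
}


def compatible(hap_a, hap_b, genotype):
    """True if the haplotype pair reproduces every non-missing genotype."""
    for a, b, g in zip(hap_a, hap_b, genotype):
        if g is not None and sorted((a, b)) != sorted(g):
            return False
    return True


def best_explanation(genotype, hap_freq):
    """Probability of the most likely haplotype pair compatible with genotype.

    Assumes Hardy-Weinberg proportions: f_a * f_b for two copies of the
    same haplotype, 2 * f_a * f_b for two different haplotypes.
    """
    best = 0.0
    for (ha, fa), (hb, fb) in combinations_with_replacement(hap_freq.items(), 2):
        if compatible(ha, hb, genotype):
            prob = fa * fb if ha == hb else 2 * fa * fb
            best = max(best, prob)
    return best


def flag_suspects(genotype, hap_freq, threshold=10.0):
    """Indices of markers whose removal makes the best explanation
    more than `threshold` times as likely (threshold is an assumption)."""
    full = best_explanation(genotype, hap_freq)
    suspects = []
    for i, g in enumerate(genotype):
        if g is None:
            continue  # already missing, nothing to test
        masked = list(genotype)
        masked[i] = None  # set this one genotype to missing
        reduced = best_explanation(masked, hap_freq)
        if full == 0.0:
            # No explanation at all with the genotype present.
            ratio = float("inf") if reduced > 0.0 else 1.0
        else:
            ratio = reduced / full
        if ratio > threshold:
            suspects.append(i)
    return suspects


# Marker 1 is heterozygous, which forces the rare haplotype (0, 1, 0).
odd = [(0, 0), (0, 1), (0, 0)]
# All-heterozygous genotype explained by the two common haplotypes.
clean = [(0, 1), (0, 1), (0, 1)]
print(flag_suspects(odd, HAP_FREQ))
print(flag_suspects(clean, HAP_FREQ))
```

In the first example, the observed genotype can only be explained by pairing the common haplotype with a rare one, while setting the middle genotype to missing allows two copies of the common haplotype, so that genotype is flagged. Because removing a constraint can never lower the best likelihood, every ratio is at least 1, and the threshold alone decides how many signals are raised; under these assumptions it trades sensitivity against false positives.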

Publication types

Research Support, Non-U.S. Gov't

MeSH terms

Computer Simulation*
Gene Frequency
Genetic Markers / genetics
Genotype
Haplotypes*
Humans
Models, Genetic*
Predictive Value of Tests
Research Design*

Substances

Genetic Markers