Missing phenotype data imputation in pedigree data analysis

Brooke L Fridley; Mariza de Andrade

doi:10.1002/gepi.20261

Genet Epidemiol. 2008 Jan;32(1):52-60. doi: 10.1002/gepi.20261.

Authors

Brooke L Fridley¹, Mariza de Andrade

Affiliation

¹ Department of Health Sciences Research, Mayo Clinic College of Medicine, Division of Biostatistics, Rochester, Minnesota 55905, USA. fridley.brooke@mayo.edu

PMID: 17685457
DOI: 10.1002/gepi.20261

Abstract

Mapping complex traits, or phenotypes with small genetic effects that may be modulated by temporal trends in families, is challenging. Detailed and accurate data must be available on families, whether or not the data were collected over time. Missing data complicate pedigree analysis, especially longitudinal pedigree analysis. Because most analytical methods developed for longitudinal pedigree data require complete data, the researcher is left with the option of either dropping the cases (individuals) with missing data from the analysis or imputing values for the missing data. We present the use of data augmentation within Bayesian polygenic and longitudinal polygenic models to produce k complete datasets. The data augmentation, or imputation, step of the Markov chain Monte Carlo algorithm takes into account the observed familial information and the observed subject information available at other time points. These k complete datasets can then be used to fit single time point or longitudinal pedigree models. By producing a set of k complete datasets, and thus k sets of parameter estimates, the total variance associated with an estimate can be partitioned into a within-imputation component and a between-imputation component. The method is illustrated using the Genetic Analysis Workshop simulated data.
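The variance partition described in the abstract can be illustrated with a minimal sketch. It assumes the standard multiple-imputation combining rule (Rubin's rules), in which the total variance is the within-imputation variance plus (1 + 1/k) times the between-imputation variance; the abstract does not state the exact formula the authors used, so this is an illustration rather than their implementation. The function name `pool_estimates` and the example numbers are hypothetical.

```python
from statistics import mean, variance


def pool_estimates(estimates, variances):
    """Combine k per-imputation estimates of one parameter.

    Assumes Rubin's rules (an assumption, not taken from the abstract):
      within  = mean of the k estimated variances
      between = sample variance of the k point estimates
      total   = within + (1 + 1/k) * between
    """
    k = len(estimates)
    if k < 2 or len(variances) != k:
        raise ValueError("need k >= 2 estimates and one variance per estimate")

    pooled = mean(estimates)        # pooled point estimate
    within = mean(variances)        # within-imputation component
    between = variance(estimates)   # between-imputation component (divisor k - 1)
    total = within + (1 + 1 / k) * between
    return pooled, within, between, total


if __name__ == "__main__":
    # Hypothetical estimates of a single parameter from k = 3 completed datasets.
    pooled, within, between, total = pool_estimates(
        [1.0, 2.0, 3.0], [0.5, 0.5, 0.5]
    )
    print(pooled, within, between, total)
```

A large between-imputation component relative to the within-imputation component would suggest that the missing values contribute substantially to the uncertainty of the estimate.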

Publication types

Research Support, N.I.H., Extramural
Research Support, Non-U.S. Gov't

MeSH terms

Bayes Theorem
Computer Simulation
Genetic Linkage*
Genotype
Humans
Longitudinal Studies
Models, Genetic*
Models, Statistical
Monte Carlo Method
Multifactorial Inheritance / genetics*
Pedigree
Phenotype*
Statistics as Topic

Grants and funding

GM31575/GM/NIGMS NIH HHS/United States