Association analysis of self-reported outcomes with a validated subset

Sedigheh Mirzaei; José M Martínez; Eric J Chow; Kirsten K Ness; Melissa M Hudson; Gregory T Armstrong; Yutaka Yasui

doi:10.1002/sim.9976

Stat Med. 2024 Feb 20;43(4):642-655. doi: 10.1002/sim.9976. Epub 2023 Dec 13.

Authors

Sedigheh Mirzaei¹, José M Martínez², Eric J Chow³ ⁴, Kirsten K Ness⁵, Melissa M Hudson⁵ ⁶, Gregory T Armstrong⁵ ⁶, Yutaka Yasui⁵

Affiliations

¹ Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, Tennessee, USA.
² Public Health Research Group, University of Alicante, Alicante, Spain.
³ Clinical Research Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.
⁴ Department of Pediatrics, Division of Hematology/Oncology, University of Washington School of Medicine, Seattle, Washington, USA.
⁵ Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee, USA.
⁶ Department of Oncology, St. Jude Children's Research Hospital, Memphis, Tennessee, USA.

PMID: 38088465
PMCID: PMC10872253 (available on 2024-08-20)
DOI: 10.1002/sim.9976

Abstract

In health-science research, outcomes ascertained through surveys and interviews are subject to potential bias with respect to the true outcome status, which is only ascertainable with clinical and laboratory assessment. This measurement error may lead to biased inference when evaluating associations between exposures and outcomes of interest. Here, we consider a cohort study in which the outcome of interest is ascertained via questionnaire, subject to imperfect ascertainment, but where a subset of participants also has a clinically assessed, validated outcome available. This presents a methodological opportunity to address potential bias. Specifically, we constructed the likelihood in two parts, one using the validated subset and the other using the subset without validation. This work expands on the approach proposed by Pepe and enables inference with standard statistical software. Weighted generalized linear model estimates for our method and maximum likelihood estimates (MLE) for Pepe's method were computed, and statistical inference was based on standard large-sample likelihood theory. We compared the finite-sample performance of the two approaches through Monte Carlo simulations. This methodological work was motivated by a large cohort study of long-term childhood cancer survivors, which provides a relevant application example in which we examined the association between clinical factors and chronic health conditions.
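The idea of a likelihood built in two parts can be illustrated with a minimal sketch. This is not the authors' weighted generalized linear model estimator, nor Pepe's method as specified in the paper; it is a generic misclassification likelihood under assumptions that the abstract does not state: a binary outcome, a logistic model with a single covariate, and a self-report whose sensitivity and specificity do not depend on the covariate. The function name, argument names, and data layout are hypothetical.

```python
import math


def expit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def two_part_loglik(beta0, beta1, sens, spec, validated, unvalidated):
    """Log-likelihood assembled from a validated and an unvalidated subset.

    Illustrative assumptions (not taken from the paper):
      - P(Y = 1 | x) = expit(beta0 + beta1 * x) for the true outcome Y
      - sens = P(S = 1 | Y = 1), spec = P(S = 0 | Y = 0) for self-report S
      - validated:   iterable of (x, y_true, s_report)
      - unvalidated: iterable of (x, s_report)
    """
    ll = 0.0

    # Part 1: validated subset, where both the true outcome and the
    # self-report are observed, so each factor enters directly.
    for x, y, s in validated:
        p = expit(beta0 + beta1 * x)
        p_y = p if y == 1 else 1.0 - p
        if y == 1:
            p_s = sens if s == 1 else 1.0 - sens
        else:
            p_s = 1.0 - spec if s == 1 else spec
        ll += math.log(p_y) + math.log(p_s)

    # Part 2: unvalidated subset, where only the self-report is observed,
    # so the unobserved true outcome is summed out.
    for x, s in unvalidated:
        p = expit(beta0 + beta1 * x)
        p_s1 = sens * p + (1.0 - spec) * (1.0 - p)
        ll += math.log(p_s1 if s == 1 else 1.0 - p_s1)

    return ll
```

Under these assumptions, the function could be passed to any general-purpose optimizer to obtain estimates, with the validated subset supplying the information needed to separate the association parameters from the reporting error.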

Keywords: Childhood cancer survivors; Generalized linear model; Late effects of cancer therapy; Observational study; Patient-reported outcomes; Recall bias.

MeSH terms

Bias
Child
Cohort Studies
Humans
Patient Reported Outcome Measures*
Self Report
Surveys and Questionnaires
