Association analysis of self-reported outcomes with a validated subset

Stat Med. 2024 Feb 20;43(4):642-655. doi: 10.1002/sim.9976. Epub 2023 Dec 13.

Abstract

In health-science research, outcomes ascertained through surveys and interviews are subject to potential bias with respect to the true outcome status, which is ascertainable only through clinical and laboratory assessment. This measurement error may lead to biased inference when evaluating associations between exposures and outcomes of interest. Here, we consider a cohort study in which the outcome of interest is ascertained imperfectly via questionnaire, but where a subset of participants also have a clinically assessed, validated outcome available. This presents a methodological opportunity to address potential bias. Specifically, we constructed the likelihood in two parts, one using the validated subset and the other using the subset without validation. This work extends the method proposed by Pepe and enables inference with standard statistical software. Weighted generalized linear model estimates for our method and maximum likelihood estimates (MLE) for Pepe's method were computed, and statistical inference was based on standard large-sample likelihood theory. We compared the finite-sample performance of the two approaches through Monte Carlo simulations. This methodological work was motivated by a large cohort study of long-term childhood cancer survivors, which allowed us to provide a relevant application example in which we examined the association between clinical factors and chronic health conditions.
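
As a rough illustration of the two-part construction described above, the likelihood can be sketched in the general form commonly used for validation-sample methods. The notation is assumed for illustration and does not appear in the abstract: Y is the true (validated) binary outcome, S the self-reported outcome, X the covariates, V the validated subset, and \beta the regression parameters of an assumed logistic outcome model. The exact formulation in the paper may differ.

```latex
% Illustrative notation only (assumed, not taken from the abstract):
% Y = validated outcome, S = self-reported outcome, X = covariates,
% V = validated subset, \beta = outcome-model parameters.
\[
L(\beta) \;=\; \prod_{i \in V} P_\beta(Y_i \mid X_i)
\;\times\; \prod_{j \notin V} P_\beta(S_j \mid X_j),
\]
\[
P_\beta(S_j \mid X_j) \;=\; \sum_{y \in \{0,1\}}
P(S_j \mid Y_j = y, X_j)\, P_\beta(Y_j = y \mid X_j),
\qquad
P_\beta(Y = 1 \mid X) \;=\; \frac{\exp(X^\top \beta)}{1 + \exp(X^\top \beta)}.
\]
```

In this sketch, validated participants contribute through the outcome model directly, while unvalidated participants contribute through their self-report, averaged over the unknown true status. The misclassification term P(S | Y, X) would typically be estimated from the validated subset.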

Keywords: Childhood cancer survivors; Generalized linear model; Late effects of cancer therapy; Observational study; Patient-reported outcomes; Recall bias.

MeSH terms

  • Bias
  • Child
  • Cohort Studies
  • Humans
  • Patient Reported Outcome Measures*
  • Self Report
  • Surveys and Questionnaires