Evaluating Model Specification When Using the Parametric G-Formula in the Presence of Censoring

Am J Epidemiol. 2023 Nov 3;192(11):1887-1895. doi: 10.1093/aje/kwad143.

Abstract

The noniterative conditional expectation (NICE) parametric g-formula can be used to estimate the causal effect of sustained treatment strategies. In addition to identifiability conditions, the validity of the NICE parametric g-formula generally requires the correct specification of models for time-varying outcomes, treatments, and confounders at each follow-up time point. An informal approach for evaluating model specification is to compare the observed distributions of the outcome, treatments, and confounders with their parametric g-formula estimates under the "natural course." In the presence of loss to follow-up, however, the observed and natural-course risks can differ even if the identifiability conditions of the parametric g-formula hold and there is no model misspecification. Here, we describe 2 approaches for evaluating model specification when using the parametric g-formula in the presence of censoring: 1) comparing factual risks estimated by the g-formula with nonparametric Kaplan-Meier estimates and 2) comparing natural-course risks estimated by inverse probability weighting with those estimated by the g-formula. We also describe how to correctly compute natural-course estimates of time-varying covariate means when using a computationally efficient g-formula algorithm. We evaluate the proposed methods via simulation and implement them to estimate the effects of dietary interventions in 2 cohort studies.

Keywords: censoring; inverse probability weighting; model misspecification; noniterative conditional expectation parametric g-formula.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Causality
  • Cohort Studies
  • Computer Simulation
  • Humans
  • Kaplan-Meier Estimate
  • Models, Statistical*
  • Probability