Evaluation of Algorithms Using Automated Health Plan Data to Identify Breast Cancer Recurrences

Cancer Epidemiol Biomarkers Prev. 2024 Mar 1;33(3):355-364. doi: 10.1158/1055-9965.EPI-23-0782.

Abstract

Background: We updated algorithms to identify breast cancer recurrences from administrative data, extending previously developed methods.

Methods: In this validation study, we evaluated pairs of breast cancer recurrence algorithms (vs. individual algorithms) to identify recurrences. We generated algorithm combinations that categorized discordant algorithm results as no recurrence [High Specificity and PPV (positive predictive value) Combination] or recurrence (High Sensitivity Combination). We compared individual and combined algorithm results to manually abstracted recurrence outcomes from a sample of 600 people with incident stage I-IIIA breast cancer diagnosed between 2004 and 2015. We used Cox regression to evaluate risk factors associated with age- and stage-adjusted recurrence rates using different recurrence definitions, weighted by inverse sampling probabilities.
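The combination rule described in the Methods can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names and the toy inputs are hypothetical, and each algorithm's output is assumed to be a yes/no recurrence flag per person. Classifying discordant results as no recurrence amounts to a logical AND of the two flags, and classifying them as recurrence amounts to a logical OR.

```python
from typing import List


def combine_high_specificity_ppv(algo_a: List[bool], algo_b: List[bool]) -> List[bool]:
    """Discordant results count as no recurrence: flag only when both algorithms agree on recurrence."""
    if len(algo_a) != len(algo_b):
        raise ValueError("Both algorithms must score the same people")
    return [a and b for a, b in zip(algo_a, algo_b)]


def combine_high_sensitivity(algo_a: List[bool], algo_b: List[bool]) -> List[bool]:
    """Discordant results count as recurrence: flag when either algorithm indicates recurrence."""
    if len(algo_a) != len(algo_b):
        raise ValueError("Both algorithms must score the same people")
    return [a or b for a, b in zip(algo_a, algo_b)]


# Hypothetical flags for four people (not study data).
algo_a = [True, True, False, False]
algo_b = [True, False, True, False]

high_spec = combine_high_specificity_ppv(algo_a, algo_b)
high_sens = combine_high_sensitivity(algo_a, algo_b)
```

In this toy input, the second and third people are discordant, so they are flagged only under the High Sensitivity rule.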

Results: Among 600 people, we identified 117 recurrences using the High Specificity and PPV Combination, 505 using the High Sensitivity Combination, and 118 using manual abstraction. The High Specificity and PPV Combination had good specificity [98%, 95% confidence interval (CI): 97-99] and PPV (72%, 95% CI: 63-80) but modest sensitivity (64%, 95% CI: 44-80). The High Sensitivity Combination had good sensitivity (80%, 95% CI: 49-94) and specificity (83%, 95% CI: 80-86) but low PPV (29%, 95% CI: 25-34). Recurrence rates using combined algorithms were similar in magnitude for most risk factors.
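For readers who want to reproduce this kind of comparison against manual abstraction, the sketch below computes sensitivity, specificity, and PPV from weighted cell totals. It assumes that the accuracy measures are weighted by inverse sampling probabilities; the abstract states this weighting explicitly only for the Cox models, so treat it as an assumption. The function name, inputs, and weights are hypothetical and are not study data, and confidence intervals are not computed here.

```python
from typing import Dict, List


def weighted_accuracy(predicted: List[bool], truth: List[bool], weights: List[float]) -> Dict[str, float]:
    """Sensitivity, specificity, and PPV with each person contributing their sampling weight."""
    if not (len(predicted) == len(truth) == len(weights)):
        raise ValueError("Inputs must have the same length")
    tp = fp = tn = fn = 0.0
    for p, t, w in zip(predicted, truth, weights):
        if p and t:
            tp += w  # algorithm and manual abstraction both indicate recurrence
        elif p and not t:
            fp += w  # algorithm flags recurrence, abstraction does not
        elif t:
            fn += w  # abstraction finds recurrence, algorithm misses it
        else:
            tn += w  # both indicate no recurrence

    def ratio(num: float, den: float) -> float:
        return num / den if den > 0 else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }


# Hypothetical example: weights are the inverse of each person's sampling probability.
predicted = [True, True, False, False]
truth = [True, False, True, False]
weights = [1.0, 2.0, 1.0, 4.0]

metrics = weighted_accuracy(predicted, truth, weights)
```

Setting every weight to 1.0 gives the unweighted measures, which makes it easy to see how much the sampling design changes each estimate.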

Conclusions: By combining algorithms, we identified breast cancer recurrences with greater PPV than individual algorithms, without additional review of discordant records.

Impact: Researchers should consider tradeoffs between accuracy and manual chart abstraction resources when using previously developed algorithms. We provided guidance for future studies that use breast cancer recurrence algorithms with or without supplemental manual chart abstraction.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Algorithms
  • Breast Neoplasms* / diagnosis
  • Breast Neoplasms* / epidemiology
  • Female
  • Humans
  • Predictive Value of Tests
  • Risk Factors
  • Sensitivity and Specificity