Natural Language Processing for Adjudication of Heart Failure Hospitalizations in a Multi-Center Clinical Trial

medRxiv [Preprint]. 2023 Aug 23:2023.08.17.23294234. doi: 10.1101/2023.08.17.23294234.

Abstract

Background: The gold standard for outcome adjudication in clinical trials is chart review by a physician clinical events committee (CEC), which requires substantial time and expertise. Automated adjudication by natural language processing (NLP) may offer a more resource-efficient alternative. We previously showed that the Community Care Cohort Project (C3PO) NLP model adjudicates heart failure (HF) hospitalizations accurately within one healthcare system.

Methods: This study externally validated the C3PO NLP model against CEC adjudication in the INVESTED trial. INVESTED compared influenza vaccination formulations in 5260 patients with cardiovascular disease at 157 North American sites. A central CEC adjudicated the cause of hospitalizations from medical records. We applied the C3PO NLP model to medical records from 4060 INVESTED hospitalizations and evaluated agreement between the NLP and final consensus CEC HF adjudications. We then fine-tuned the C3PO NLP model (C3PO+INVESTED) and trained a de novo model using half the INVESTED hospitalizations, and evaluated these models in the other half. NLP performance was benchmarked to CEC reviewer inter-rater reproducibility.

Results: 1074 hospitalizations (26%) were adjudicated as HF by the CEC. There was high agreement between the C3PO NLP and CEC HF adjudications (agreement 87%, kappa statistic 0.69). C3PO NLP model sensitivity was 94% and specificity was 84%. The fine-tuned C3PO and de novo NLP models demonstrated agreement of 93% and kappa of 0.82 and 0.83, respectively. CEC reviewer inter-rater reproducibility was 94% (kappa 0.85).

Conclusion: Our NLP model developed within a single healthcare system accurately identified HF events relative to the gold-standard CEC in an external multi-center clinical trial. Fine-tuning the model improved agreement and approximated human reproducibility. NLP may improve the efficiency of future multi-center clinical trials by accurately identifying clinical events at scale.

Publication types

  • Preprint