Development and evaluation of a text analytics algorithm for automated application of national COVID-19 shielding criteria in rheumatology patients

Meghna Jani; Ghada Alfattni; Maksim Belousov; Lynn Laidlaw; Yuanyuan Zhang; Michael Cheng; Karim Webb; Robyn Hamilton; Andrew S Kanter; William G Dixon; Goran Nenadic

doi:10.1136/ard-2024-225544

Ann Rheum Dis. 2024 Apr 4:ard-2024-225544. doi: 10.1136/ard-2024-225544. Online ahead of print.

Authors

Meghna Jani¹ ² ³, Ghada Alfattni⁴ ⁵, Maksim Belousov⁴, Lynn Laidlaw⁶, Yuanyuan Zhang⁶, Michael Cheng⁷, Karim Webb⁷, Robyn Hamilton⁷, Andrew S Kanter⁸, William G Dixon⁶ ² ³, Goran Nenadic⁴

Affiliations

¹ Centre for Epidemiology Versus Arthritis, Centre for Musculoskeletal Research, The University of Manchester, Manchester, UK. Correspondence: meghna.jani@manchester.ac.uk
² Department of Rheumatology, Northern Care Alliance NHS Foundation Trust Salford Care Organisation, Salford, UK.
³ NIHR Manchester Biomedical Research Centre, Manchester University NHS Foundation Trust, Manchester Academic Health Science Centre, Manchester, UK.
⁴ Department of Computer Science, The University of Manchester, Manchester, UK.
⁵ Department of Computer Science, Jamoum University College, Umm Al-Qura University, Makkah, Saudi Arabia.
⁶ Centre for Epidemiology Versus Arthritis, Centre for Musculoskeletal Research, The University of Manchester, Manchester, UK.
⁷ Department of Business Intelligence, Northern Care Alliance NHS Foundation Trust, Salford Care Organisation, Salford, UK.
⁸ Department of Biomedical Informatics, Columbia University, New York, New York, USA.

PMID: 38575324
DOI: 10.1136/ard-2024-225544

Abstract

Introduction: At the beginning of the COVID-19 pandemic, the UK's Scientific Committee issued extreme social distancing measures, termed 'shielding', aimed at a subpopulation deemed extremely clinically vulnerable to infection. National guidance for risk stratification was based on patients' age, comorbidities and immunosuppressive therapies, including biologics that are not captured in primary care records. This process required considerable clinician time to manually review outpatient letters. Our aim was to develop and evaluate an automated shielding algorithm by text-mining outpatient letter diagnoses and medications, reducing the need for future manual review.

Methods: Rheumatology outpatient letters from a large UK foundation trust were retrieved. Free-text diagnoses were processed using Intelligent Medical Objects software (Concept Tagger), which used interface terminology for each condition mapped to Systematized Nomenclature of Medicine Clinical Terms (SNOMED-CT) codes. We developed the Medication Concept Recognition tool (Named Entity Recognition) to retrieve each medication's type, dose, duration and status (active/past) at the time of the letter. Age, diagnosis and medication variables were then combined to calculate a shielding score based on the most recent letter. The algorithm's performance was evaluated using clinical review as the gold standard. The time taken to deploy the developed algorithm on a larger patient subset was measured.
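
The combination step described in the Methods (age, diagnosis and medication variables from the most recent letter feeding a single score) might be sketched roughly as follows. This is an illustrative sketch only: the point values, age cut-off, threshold, drug names and condition labels are hypothetical placeholders, not the national shielding criteria or the authors' scoring rules, and the `Letter` and `Medication` types are invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Set

# Hypothetical placeholders, NOT the national criteria or the authors' rules.
HYPOTHETICAL_MEDICATION_POINTS = {"prednisolone": 2, "methotrexate": 1, "rituximab": 1}
HYPOTHETICAL_COMORBIDITIES = {"diabetes", "interstitial lung disease"}
HYPOTHETICAL_AGE_CUTOFF = 70
HYPOTHETICAL_SHIELD_THRESHOLD = 3


@dataclass
class Medication:
    name: str
    active: bool  # status at the time of the letter (active vs past)


@dataclass
class Letter:
    letter_date: date
    age: int
    diagnoses: Set[str]
    medications: List[Medication]


def shielding_score(letters: List[Letter]) -> int:
    """Score a patient using only their most recent letter."""
    latest = max(letters, key=lambda letter: letter.letter_date)
    score = 0
    # Age contributes one point at or above the (assumed) cut-off.
    if latest.age >= HYPOTHETICAL_AGE_CUTOFF:
        score += 1
    # Any listed comorbidity contributes one point.
    if latest.diagnoses & HYPOTHETICAL_COMORBIDITIES:
        score += 1
    # Only medications marked as active count towards the score.
    for med in latest.medications:
        if med.active:
            score += HYPOTHETICAL_MEDICATION_POINTS.get(med.name.lower(), 0)
    return score


def should_shield(letters: List[Letter]) -> bool:
    """Apply the (assumed) threshold to the score."""
    return shielding_score(letters) >= HYPOTHETICAL_SHIELD_THRESHOLD
```

In this sketch, a medication recorded as past in the latest letter does not count, even if an earlier letter listed it as active, which mirrors the abstract's emphasis on status at the time of the most recent letter.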

Results: In total, 5942 free-text diagnoses were extracted and mapped to SNOMED-CT, with 13 665 free-text medications (n=803 patients). The automated algorithm demonstrated a sensitivity of 80% (95% CI: 75%, 85%) and specificity of 92% (95% CI: 90%, 94%). Positive likelihood ratio was 10 (95% CI: 8, 14), negative likelihood ratio was 0.21 (95% CI: 0.16, 0.28) and F1 score was 0.81. Evaluation of mismatches revealed that the algorithm performed correctly against the gold standard in most cases. The developed algorithm was then deployed on records from an additional 15 865 patients, which took 18 hours for data extraction and 1 hour to deploy.
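
As a reading aid, the likelihood ratios reported above follow from sensitivity and specificity by the standard definitions: LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The short sketch below applies them to the rounded figures in the abstract. The function names are ours, and the F1 helper is shown only as the standard formula, since the abstract does not report the underlying counts.

```python
from typing import Tuple


def likelihood_ratios(sensitivity: float, specificity: float) -> Tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity given as proportions."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Standard F1: harmonic mean of precision and recall, from raw counts."""
    return 2 * tp / (2 * tp + fp + fn)


# Rounded values from the abstract: sensitivity 80%, specificity 92%.
lr_pos, lr_neg = likelihood_ratios(0.80, 0.92)
print(round(lr_pos, 1), round(lr_neg, 2))  # 10.0 0.22
```

The rounded inputs reproduce the reported positive likelihood ratio of 10. They give about 0.22 for the negative likelihood ratio rather than the published 0.21; the small difference is presumably because the published value was computed from unrounded sensitivity and specificity.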

Discussion: An automated algorithm for risk stratification has several advantages including reducing clinician time for manual review to allow more time for direct care, improving efficiency and increasing transparency in individual patient communication. It has the potential to be adapted for future public health initiatives that require prompt automated review of hospital outpatient letters.

Keywords: Biological Therapy; Covid-19; Epidemiology.