Identifying QT prolongation from ECG impressions using a general-purpose Natural Language Processor

Int J Med Inform. 2009 Apr;78 Suppl 1(Suppl 1):S34-42. doi: 10.1016/j.ijmedinf.2008.09.001. Epub 2008 Oct 19.

Abstract

Objective: Typically detected via electrocardiograms (ECGs), QT interval prolongation is a known risk factor for sudden cardiac death. Since medications can promote or exacerbate the condition, detection of QT interval prolongation is important for clinical decision support. We investigated the accuracy of natural language processing (NLP) for identifying QT prolongation from cardiologist-generated, free-text ECG impressions compared to corrected QT (QTc) thresholds reported by ECG machines.

Methods: After integrating negation detection into a locally developed natural language processor, the KnowledgeMap concept identifier, we evaluated NLP-based detection of QT prolongation compared to the calculated QTc on a set of 44,318 ECGs obtained from hospitalized patients. We also created a string query using regular expressions to identify QT prolongation. We calculated sensitivity and specificity of the methods using manual physician review of the cardiologist-generated reports as the gold standard. To investigate causes of "false positive" calculated QTc, we manually reviewed randomly selected ECGs with a long calculated QTc but no mention of QT prolongation. Separately, before developing the NLP query for QT prolongation, we validated the performance of the negation detection algorithm on 5000 manually categorized ECG phrases for any medical concept (not limited to QT prolongation).
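The abstract does not give the regular expressions or the negation rules that were used. As a rough illustration only, a string query with a simple negation check might look like the following Python sketch. The patterns, the negation cue list, and the function name `mentions_qt_prolongation` are all assumptions made for this example and are not taken from the study or from KnowledgeMap.

```python
import re

# Hypothetical pattern (not from the study): matches phrases such as
# "prolonged QT", "long QTc", "QT prolongation", "QTc interval prolonged".
QT_PATTERN = re.compile(
    r"\b(?:prolonged|long)\s+qtc?\b"
    r"|\bqtc?\s+(?:interval\s+)?(?:prolongation|prolonged)\b",
    re.IGNORECASE,
)

# Hypothetical negation cues (assumed): only cues that appear BEFORE the
# finding within the same sentence are considered.
NEGATION_PATTERN = re.compile(r"\b(?:no|not|without)\b", re.IGNORECASE)


def mentions_qt_prolongation(impression: str) -> bool:
    """Return True if any sentence asserts QT prolongation without a preceding negation cue."""
    # Split the free-text impression into rough sentences.
    for sentence in re.split(r"[.;\n]", impression):
        match = QT_PATTERN.search(sentence)
        if match is None:
            continue
        # Text before the finding in this sentence is checked for negation cues.
        preceding = sentence[: match.start()]
        if NEGATION_PATTERN.search(preceding) is None:
            return True
    return False


if __name__ == "__main__":
    for text in [
        "Sinus rhythm. Prolonged QT interval.",
        "Normal sinus rhythm. No QT prolongation.",
    ]:
        print(text, "->", mentions_qt_prolongation(text))
```

This sketch ignores negation cues that follow the finding and any hedging language, which a full negation detection algorithm would need to handle.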

Results: The NLP query for QT prolongation correctly identified 2364 of 2373 ECGs with QT prolongation with a sensitivity of 0.996 and a positive predictive value of 1.000. There were no false positives. The regular expression query had a sensitivity of 0.999 and positive predictive value of 0.982. In contrast, the positive predictive value of common QTc thresholds derived from ECG machines was 0.07-0.25 with corresponding sensitivities of 0.994-0.046. The negation detection algorithm had a recall of 0.973 and precision of 0.982 for 10,490 concepts found within ECG impressions.
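The reported NLP figures can be checked from the counts given above, using the standard definitions of sensitivity and positive predictive value. The short Python sketch below does this. The function names are illustrative, and the false negative count of 9 is derived here as 2373 minus 2364 rather than stated in the abstract.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of true cases that the query detected: TP / (TP + FN)."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Fraction of flagged cases that are true cases: TP / (TP + FP)."""
    return true_pos / (true_pos + false_pos)


# Counts from the Results: 2364 of 2373 ECGs identified, no false positives.
tp = 2364
fn = 2373 - 2364  # derived: ECGs with QT prolongation that the query missed
fp = 0

nlp_sensitivity = round(sensitivity(tp, fn), 3)
nlp_ppv = round(positive_predictive_value(tp, fp), 3)

if __name__ == "__main__":
    print(f"sensitivity = {nlp_sensitivity:.3f}")
    print(f"PPV         = {nlp_ppv:.3f}")
```

Specificity is not computed here because the abstract does not report the number of true negatives.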

Conclusion: NLP and regular expression queries of cardiologists' ECG interpretations can more effectively identify QT prolongation than the automated QTc intervals reported by ECG machines. Future clinical decision support could employ NLP queries to detect QTc prolongation and other reported ECG abnormalities.

Publication types

  • Research Support, N.I.H., Extramural
  • Validation Study

MeSH terms

  • Algorithms
  • Electrocardiography*
  • Humans
  • Natural Language Processing*
  • Sensitivity and Specificity
  • Systems Integration