Qualitative analysis of manual annotations of clinical text with SNOMED CT

PLoS One. 2018 Dec 27;13(12):e0209547. doi: 10.1371/journal.pone.0209547. eCollection 2018.

Abstract

SNOMED CT provides about 300,000 codes with fine-grained concept definitions to support interoperability of health data. Coding clinical texts with medical terminologies is not a trivial task and is prone to disagreements between coders. We conducted a qualitative analysis to identify sources of disagreement in an annotation experiment which used a subset of SNOMED CT with some restrictions. A corpus of 20 English clinical text fragments from diverse origins and languages was annotated independently by two medically trained annotators following a specific annotation guideline. Following this guideline, the annotators had to assign sets of SNOMED CT codes to noun phrases, together with concept and term coverage ratings. The annotations were then manually examined against a reference standard to determine sources of disagreement. Five categories of disagreement were identified. In our results, the most frequent cause of inter-annotator disagreement was related to human issues. In several cases, disagreements revealed gaps in the annotation guideline and insufficient annotator training. The remaining issues can be influenced by some SNOMED CT features.
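The annotation structure described in the abstract (sets of SNOMED CT codes attached to noun phrases, plus concept and term coverage ratings) can be made concrete with a short sketch. The Python fragment below is illustrative only and does not reflect the study's actual tooling; the Annotation class, the code_disagreements helper, the two-level rating scale, and the example SNOMED CT codes are assumptions made for demonstration.

    from dataclasses import dataclass

    @dataclass(frozen=True)
    class Annotation:
        """One annotator's judgement for a single noun phrase (illustrative structure)."""
        phrase: str               # noun phrase from the clinical text
        codes: frozenset          # set of SNOMED CT concept codes assigned
        concept_coverage: str     # e.g. "full" or "partial" (assumed scale)
        term_coverage: str        # e.g. "full" or "partial" (assumed scale)

    def code_disagreements(a, b):
        """Return phrases where two annotators assigned different code sets."""
        codes_a = {ann.phrase: ann.codes for ann in a}
        codes_b = {ann.phrase: ann.codes for ann in b}
        return [p for p in codes_a if codes_a[p] != codes_b.get(p)]

    # Hypothetical example: annotators disagree on the code set for "chest pain".
    annotator1 = [Annotation("chest pain", frozenset({"29857009"}), "full", "full")]
    annotator2 = [Annotation("chest pain", frozenset({"29857009", "22253000"}), "partial", "full")]
    print(code_disagreements(annotator1, annotator2))  # -> ['chest pain']

Comparing the per-phrase code sets in this way surfaces exactly the kind of inter-annotator disagreements that the study then examined qualitatively against a reference standard.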

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Data Curation*
  • Evaluation Studies as Topic
  • Guidelines as Topic
  • Humans
  • Systematized Nomenclature of Medicine*