Assessment of Data Quality Variability across Two EHR Systems through a Case Study of Post-Surgical Complications

Sunyang Fu; Andrew Wen; Gavin M Schaeferle; Patrick M Wilson; Gabriel Demuth; Xiaoyang Ruan; Sijia Liu; Curtis Storlie; Hongfang Liu

AMIA Jt Summits Transl Sci Proc. 2022 May 23:2022:196-205. eCollection 2022.

Authors

Sunyang Fu¹, Andrew Wen¹, Gavin M Schaeferle², Patrick M Wilson², Gabriel Demuth³, Xiaoyang Ruan¹, Sijia Liu¹, Curtis Storlie³ ², Hongfang Liu¹

Affiliations

¹ Department of AI and Informatics Research, Mayo Clinic, Rochester, MN, USA.
² Kern Center for the Science of Healthcare Delivery, Mayo Clinic, Rochester, MN, USA.
³ Department of Quantitative Health Sciences, Mayo Clinic, Rochester, MN, USA.

PMID: 35854735
PMCID: PMC9285181

Abstract

Translating predictive modeling algorithms into routine clinical care workflows is challenged by data quality issues that arise from the heterogeneity of electronic health record (EHR) systems. To better understand these issues, we retrospectively assessed and compared the variability of data produced by two different EHR systems. We considered three dimensions of data quality in the context of EHR-based predictive modeling, one for each of three distinct translational stages: model development (data completeness), model deployment (data variability), and model implementation (data timeliness). The case study was based on predicting post-surgical complications using both structured and unstructured data. We found a consistent level of data completeness, high syntactic variability, and moderate-to-high semantic variability across the two EHR systems. The findings suggest that data quality is context-specific and closely related to the documentation workflow and the functionality of each EHR system.
