Assessment of Data Quality Variability across Two EHR Systems through a Case Study of Post-Surgical Complications

AMIA Jt Summits Transl Sci Proc. 2022 May 23;2022:196-205. eCollection 2022.

Abstract

Translating predictive modeling algorithms into routine clinical care workflows is challenged by data quality issues that arise from the heterogeneity of electronic health record (EHR) systems. To better understand these issues, we retrospectively assessed and compared the variability of data produced by two different EHR systems. We considered three dimensions of data quality in the context of EHR-based predictive modeling, one for each of three distinct translational stages: model development (data completeness), model deployment (data variability), and model implementation (data timeliness). The case study was based on predicting post-surgical complications using both structured and unstructured data. We found a consistent level of data completeness, high syntactic variability, and moderate-to-high semantic variability across the two EHR systems, where data quality is context-specific and closely related to the documentation workflow and functionality of each EHR system.