A narrative review on the validity of electronic health record-based research in epidemiology

BMC Med Res Methodol. 2021 Oct 27;21(1):234. doi: 10.1186/s12874-021-01416-5.

Abstract

Electronic health records (EHRs) are widely used in epidemiological research, but the validity of the results depends on the assumptions made about the healthcare system, the patient, and the provider. In this review, we identify four overarching challenges in using EHR-based data for epidemiological analysis, with a particular emphasis on threats to validity. These challenges include the representativeness of the EHR population with respect to a target population, the availability and interpretability of clinical and non-clinical data, and missing data at both the variable and observation levels. Each challenge reveals layers of assumptions that the epidemiologist is required to make, from the patient's entry into the healthcare system to the provider's documentation of the clinical exam results and the longitudinal follow-up of the patient, all of which have the potential to bias the results of analyses of these data. Understanding the extent of potential biases, and remediating them, requires a variety of methodological approaches, from traditional sensitivity analyses and validation studies to newer techniques such as natural language processing. Beyond methods to address these challenges, it will remain crucial for epidemiologists to engage with clinicians and informaticians at their institutions, forming multidisciplinary teams around specific research projects to ensure data quality and accessibility.

Keywords: Bias; Data quality; Electronic health records; Secondary analysis; Validity.

Publication types

  • Research Support, N.I.H., Extramural
  • Review

MeSH terms

  • Bias
  • Delivery of Health Care*
  • Electronic Health Records*
  • Health Services Needs and Demand
  • Humans