High-Throughput Algorithm for Discovering New Drug Indications by Utilizing Large-Scale Electronic Medical Record Data

Clin Pharmacol Ther. 2020 Dec;108(6):1299-1307. doi: 10.1002/cpt.1980. Epub 2020 Aug 13.

Abstract

Drug repositioning is an effective way to mitigate the production problem in the pharmaceutical industry. Electronic medical record (EMR) databases harbor a large amount of data on drug prescriptions and laboratory test results and may thus be useful for finding new indications for existing drugs. Here, we present a novel high-throughput data-driven algorithm that uses large-scale EMR data to identify and prioritize drug candidates that show significant effects on specific clinical indicators. We chose four laboratory tests as clinical indicators: hemoglobin A1c (HbA1c), low-density lipoprotein (LDL) cholesterol, triglycerides (TGs), and high-density lipoprotein (HDL) cholesterol. From a 5-year EMR database, we generated datasets of paired data for each patient and each drug, consisting of the averaged measurement values during the periods on and off that drug, adjusted for co-administered drug effects at each timepoint, and applied a one-sample t-test with Bonferroni correction for statistical analysis. Among 1,774 drugs, 45 were associated with increases in HDL cholesterol, and 41, 146, and 65 were associated with reductions in HbA1c, LDL cholesterol, and TGs, respectively. We compared the list of candidate drugs with that of drugs indicated for relevant clinical conditions and found that the algorithm had high values for both sensitivity (range 0.95-1.00) and negative predictive value (range 0.95-1.00). Our algorithm was able to rediscover well-known drugs that are used for diabetes and dyslipidemia while revealing potential candidates that lack current indications but have shown promising results in the literature. Our algorithm may facilitate the repositioning of drugs with proven safety profiles.
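
The statistical step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that the paired data have already been reduced to one on-minus-off difference per patient, that the one-sample t-test compares those differences against zero, and that the Bonferroni threshold is 0.05 divided by the number of drugs screened. The adjustment for co-administered drugs is not covered. The drug names, the values, and the function names (two_sided_p, one_sample_t, screen) are hypothetical, and the p-value is obtained by numerically integrating the t density so that only the standard library is needed.

```python
import math
from statistics import mean, stdev


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p(t, df, steps=2000):
    """Two-sided p-value, integrating the density on [0, |t|] with Simpson's rule."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # P(0 < T < |t|)
    return max(0.0, 1.0 - 2.0 * area)


def one_sample_t(diffs):
    """One-sample t-test of per-patient (on - off) differences against zero."""
    n = len(diffs)
    m = mean(diffs)
    sd = stdev(diffs)
    if sd == 0:
        return m, None, None      # test undefined when all differences are equal
    t = m / (sd / math.sqrt(n))
    return m, t, two_sided_p(t, n - 1)


def screen(diffs_by_drug, alpha=0.05):
    """Return drugs whose mean difference is significant after Bonferroni correction."""
    threshold = alpha / len(diffs_by_drug)   # assumed: one test per drug
    hits = {}
    for drug, diffs in diffs_by_drug.items():
        if len(diffs) < 2:
            continue              # need at least two patients for a variance
        m, t, p = one_sample_t(diffs)
        if p is not None and p < threshold:
            hits[drug] = (m, p)
    return hits


# Hypothetical HbA1c differences (on-drug average minus off-drug average).
toy = {
    "drug_A": [-0.6, -0.4, -0.5, -0.7, -0.5, -0.3, -0.6, -0.5],
    "drug_B": [0.1, -0.2, 0.05, -0.1, 0.15, -0.05, 0.0, 0.05],
}
hits = screen(toy)
for drug, (m, p) in hits.items():
    direction = "reduction" if m < 0 else "increase"
    print(f"{drug}: mean change {m:.3f} ({direction}), p = {p:.2e}")
```

In this sketch the sign of the mean difference gives the direction of the effect, so the same routine would serve both for indicators where a reduction is of interest (HbA1c, LDL cholesterol, TGs) and for HDL cholesterol, where an increase is.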

Publication types

  • Research Support, Non-U.S. Gov't
  • Validation Study

MeSH terms

  • Algorithms*
  • Biomarkers / blood
  • Cholesterol, HDL / blood
  • Cholesterol, LDL / blood
  • Data Mining*
  • Databases, Factual
  • Drug Prescriptions
  • Drug Repositioning*
  • Electronic Health Records*
  • Glycated Hemoglobin / analysis
  • Humans
  • Hypoglycemic Agents / adverse effects
  • Hypoglycemic Agents / therapeutic use*
  • Hypolipidemic Agents / adverse effects
  • Hypolipidemic Agents / therapeutic use*
  • Reproducibility of Results
  • Time Factors
  • Triglycerides / blood

Substances

  • Biomarkers
  • Cholesterol, HDL
  • Cholesterol, LDL
  • Glycated Hemoglobin A
  • Hypoglycemic Agents
  • Hypolipidemic Agents
  • Triglycerides
  • hemoglobin A1c protein, human