Algorithms for the identification of prevalent diabetes in the All of Us Research Program validated using polygenic scores - a new resource for diabetes precision medicine

medRxiv [Preprint]. 2023 Sep 5:2023.09.05.23295061. doi: 10.1101/2023.09.05.23295061.

Abstract

Objective: The study aimed to develop and validate algorithms for identifying people with type 1 and type 2 diabetes in the All of Us Research Program (AoU) cohort, using electronic health record (EHR) and survey data.

Research design and methods: Two sets of algorithms were developed, one using only EHR data (EHR), and the other using a combination of EHR and survey data (EHR+). Their performance was evaluated by testing their association with polygenic scores for both type 1 and type 2 diabetes.
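As an illustration of this kind of validation, the following is a minimal sketch of how a case definition might be scored against a polygenic score. It is not the authors' pipeline. It assumes the comparison rests on the area under the ROC curve (AUC), which the DeLong p-values reported in the Results suggest. The function name `auc_from_scores` and all toy values are hypothetical.

```python
from typing import Sequence


def auc_from_scores(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Return the AUC of `scores` for separating cases (1) from controls (0).

    Uses the pairwise (Mann-Whitney) definition: the fraction of case/control
    pairs in which the case has the higher score, counting ties as one half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for case_score in cases:
        for control_score in controls:
            if case_score > control_score:
                wins += 1.0
            elif case_score == control_score:
                wins += 0.5
    return wins / (len(cases) * len(controls))


# Hypothetical toy data: one polygenic score per participant, plus the case
# labels assigned by two competing (made-up) phenotype definitions.
pgs = [0.1, 0.4, 0.35, 0.8, 0.7, 0.2]
labels_ehr_only = [0, 0, 1, 1, 0, 0]
labels_ehr_plus = [0, 0, 1, 1, 1, 0]

auc_ehr_only = auc_from_scores(pgs, labels_ehr_only)
auc_ehr_plus = auc_from_scores(pgs, labels_ehr_plus)
print(f"EHR-only AUC: {auc_ehr_only:.3f}")
print(f"EHR+ AUC:     {auc_ehr_plus:.3f}")
```

Under this assumption, a definition whose cases are better separated from controls by the polygenic score yields a higher AUC. A formal comparison of two AUCs would additionally require a significance test such as DeLong's, which this sketch does not implement.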

Results: For type 1 diabetes, the EHR-only algorithm showed a stronger association with the T1D polygenic score (p=3×10⁻⁵) than the EHR+ algorithm. For type 2 diabetes, the EHR+ algorithm outperformed both the EHR-only algorithm and the existing AoU definition, identifying additional cases (25.79% and 22.57% more, respectively) and showing a stronger association with the T2D polygenic score (DeLong p=0.03 and 1×10⁻⁴, respectively).

Conclusions: We provide new validated definitions of type 1 and type 2 diabetes in AoU, and make them available for researchers. These algorithms, by ensuring consistent diabetes definitions, pave the way for high-quality diabetes research and future clinical discoveries.

Publication types

  • Preprint