A gene pathogenicity tool "GenePy" identifies missed biallelic diagnoses in the 100,000 Genomes Project

Genet Med. 2024 Apr;26(4):101073. doi: 10.1016/j.gim.2024.101073. Epub 2024 Jan 18.

Abstract

Purpose: The 100,000 Genomes Project diagnosed a quarter of affected participants, but 26% of diagnoses were in genes not on the applied gene panel(s), many of them involving de novo variants. Assessing biallelic variants without a gene panel is more challenging.

Methods: We sought to identify missed biallelic diagnoses using GenePy, which incorporates allele frequency, zygosity, and a user-defined deleterious metric, generating an aggregate GenePy score per gene, per participant. We calculated GenePy scores for 2862 recessive disease genes in 78,216 100,000 Genomes Project participants. For each gene, we ranked participant GenePy scores and scrutinized affected participants without a diagnosis whose scores ranked among the top 5 for that gene. In cases in which participant phenotypes overlapped with the disease gene of interest, we extracted rare variants and applied phase, ClinVar, and ACMG classification.
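
The per-gene ranking step described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name `top_ranked_candidates`, the nested-dictionary input layout, the tie-breaking rule, and the toy gene and participant identifiers are all assumptions made for illustration. It also assumes that ranks are computed across all scored participants before filtering to undiagnosed affected individuals, which the abstract does not state explicitly, and it takes GenePy scores as given rather than computing them.

```python
from typing import Dict, List, Set


def top_ranked_candidates(
    scores: Dict[str, Dict[str, float]],
    undiagnosed_affected: Set[str],
    top_n: int = 5,
) -> Dict[str, List[str]]:
    """For each gene, rank all participants by score (highest first) and
    return the undiagnosed affected participants found in the top_n ranks.

    `scores` maps gene -> participant ID -> precomputed per-gene score
    (a hypothetical layout, not the GenePy output format).
    """
    candidates: Dict[str, List[str]] = {}
    for gene, per_participant in scores.items():
        # Descending score; ties broken by participant ID so the result
        # is deterministic (an assumption, not taken from the paper).
        ranked = sorted(
            per_participant,
            key=lambda pid: (-per_participant[pid], pid),
        )
        hits = [pid for pid in ranked[:top_n] if pid in undiagnosed_affected]
        if hits:
            candidates[gene] = hits
    return candidates


# Toy data: two hypothetical genes, seven hypothetical participants.
toy_scores = {
    "GENE_A": {"P1": 4.2, "P2": 0.1, "P3": 3.7, "P4": 0.0,
               "P5": 1.5, "P6": 2.9, "P7": 0.3},
    "GENE_B": {"P1": 0.0, "P2": 5.0, "P3": 0.2, "P4": 0.1,
               "P5": 0.0, "P6": 0.0, "P7": 0.4},
}
toy_undiagnosed = {"P3", "P7"}

result = top_ranked_candidates(toy_scores, toy_undiagnosed)
print(result)
```

In this toy example, participants P3 and P7 are retained for both genes because they are undiagnosed and fall within the top five ranks. In a real analysis the retained gene-participant pairs would then go on to the phenotype-overlap review described above.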

Results: 3184 affected individuals without a molecular diagnosis had a top-5-ranked GenePy score and 682 of 3184 (21%) had phenotypes overlapping with a top-ranking gene. In 122 of 669 (18%) phenotype-matched cases (excluding 13 withdrawn participants), we identified a putative missed diagnosis (2.2% of all undiagnosed participants). A further 334 of 669 (50%) cases have a possible missed diagnosis but require functional validation.
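
The counts reported in the Results can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated in the abstract; the variable names and the `pct` helper are illustrative.

```python
# Counts as stated in the abstract.
top_ranked = 3184          # undiagnosed affected individuals with a top-5 score
phenotype_overlap = 682    # of those, phenotype overlaps a top-ranking gene
withdrawn = 13             # withdrawn participants excluded from review
putative = 122             # putative missed diagnoses
possible = 334             # possible diagnoses needing functional validation

reviewed = phenotype_overlap - withdrawn   # phenotype-matched cases reviewed
total_potential = putative + possible      # total potential diagnoses


def pct(part: int, whole: int) -> int:
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


print(reviewed, total_potential)
print(pct(phenotype_overlap, top_ranked), pct(putative, reviewed), pct(possible, reviewed))
```

The 669 reviewed cases are the 682 phenotype-matched cases minus the 13 withdrawn participants, and the 122 putative plus 334 possible diagnoses give 456 potential diagnoses in total.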

Conclusion: Applying GenePy at scale has identified 456 potential diagnoses, demonstrating the value of novel diagnostic strategies.

Keywords: Diagnostic uplift; Next-generation sequencing; Novel methods; Rare disease; Recessive disease.

MeSH terms

  • Gene Frequency / genetics
  • Genes, Recessive
  • Humans
  • Missed Diagnosis*
  • Phenotype
  • Virulence