Frequently used bioinformatics tools overestimate the damaging effect of allelic variants

Genes Immun. 2019 Jan;20(1):10-22. doi: 10.1038/s41435-017-0002-z. Epub 2017 Dec 4.

Abstract

We selected two sets of naturally occurring human missense allelic variants within innate immune genes. The first set comprised eleven non-synonymous variants in six different genes involved in interferon (IFN) induction, present in a cohort of patients suffering from herpes simplex encephalitis (HSE), and the second set comprised sixteen allelic variants of the IFNLR1 gene. We recreated the variants in vitro and tested their effect on protein function in a HEK293T cell-based assay. We then used an array of 14 available bioinformatics tools to predict the effect of these variants on protein function. To our surprise, two of the most commonly used tools, CADD and SIFT, produced a high rate of false positives, whereas SNPs&GO exhibited the lowest rate of false positives in our test. Because the problem in our test was, in general, false positives, inclusion of the mutation significance cutoff (MSC) did not improve accuracy.
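
The comparison described above, tool predictions scored against a functional assay, can be illustrated with a minimal Python sketch. It assumes a "false positive" means a variant that a tool calls damaging while the assay shows no functional effect; the abstract does not spell out the exact definition. The function name, the tool names, and the toy labels are hypothetical and are not data from the study.

```python
def false_positive_rate(predicted_damaging, assay_damaging):
    """Return FP / (FP + TN), computed over variants the assay found functionally neutral."""
    if len(predicted_damaging) != len(assay_damaging):
        raise ValueError("prediction and assay lists must have the same length")
    pairs = list(zip(predicted_damaging, assay_damaging))
    # False positive: tool says damaging, assay says neutral.
    fp = sum(1 for pred, truth in pairs if pred and not truth)
    # True negative: tool says neutral, assay says neutral.
    tn = sum(1 for pred, truth in pairs if not pred and not truth)
    if fp + tn == 0:
        raise ValueError("no assay-neutral variants; false positive rate is undefined")
    return fp / (fp + tn)


# Hypothetical toy labels (True = damaging), NOT results from the paper.
assay = [False, False, False, False, True]
tool_a = [True, True, True, False, True]    # over-calls damage
tool_b = [False, True, False, False, True]  # more conservative

print(false_positive_rate(tool_a, assay))  # 0.75
print(false_positive_rate(tool_b, assay))  # 0.25
```

In a set where most variants are functionally neutral, as the abstract suggests was the case here, a tool that over-calls damage is penalised mainly through this rate rather than through missed damaging variants.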

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Child
  • Computational Biology / standards*
  • Encephalitis, Herpes Simplex / genetics*
  • False Positive Reactions
  • Female
  • Genetic Testing / standards*
  • Genome-Wide Association Study / standards*
  • HEK293 Cells
  • Humans
  • Male
  • Mutation, Missense
  • Polymorphism, Single Nucleotide
  • Receptors, Cytokine / genetics
  • Receptors, Cytokine / metabolism
  • Receptors, Interferon
  • Software / standards*

Substances

  • IFNLR1 protein, human
  • Receptors, Cytokine
  • Receptors, Interferon