GWAS significance thresholds for deep phenotyping studies can depend upon minor allele frequencies and sample size

Mol Psychiatry. 2021 Jun;26(6):2048-2055. doi: 10.1038/s41380-020-0670-3. Epub 2020 Feb 17.

Abstract

An important issue affecting genome-wide association studies with deep phenotyping (multiple correlated phenotypes) is determining the suitable family-wise significance threshold. Straightforward family-wise correction (Bonferroni) of p < 0.05 for 4.3 million genotypes and 335 phenotypes would give a threshold of p < 3.46E-11. This would be too conservative because it assumes all tests are independent. The effective number of tests, both phenotypic and genotypic, must be adjusted for the correlations between them. Spectral decomposition of the phenotype matrix and LD-based correction of the number of tested SNPs are currently used to determine an effective number of tests. In this paper, we compare these calculated estimates with permutation-determined family-wise significance thresholds. Permutations are performed by shuffling individual IDs of the genotype vector for this dataset, to preserve correlation of phenotypes. Our results demonstrate that the permutation threshold is influenced by minor allele frequency (MAF) of the SNPs, and by the number of individuals tested. For the more common SNPs (MAF > 0.1), the permutation family-wise threshold was in close agreement with spectral decomposition methods. However, for less common SNPs (0.05 < MAF ≤ 0.1), the permutation threshold calculated over all SNPs was off by orders of magnitude. This applies to the number of individuals studied (here 777) but not to very much larger numbers. Based on these findings, we propose that the threshold to find a particular level of family-wise significance may need to be established using separate permutations of the actual data for several MAF bins.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Gene Frequency / genetics
  • Genome-Wide Association Study*
  • Genotype
  • Humans
  • Polymorphism, Single Nucleotide* / genetics
  • Sample Size