Ensemble genomic analysis in human lung tissue identifies novel genes for chronic obstructive pulmonary disease

Hum Genomics. 2018 Jan 15;12(1):1. doi: 10.1186/s40246-018-0132-z.

Abstract

Background: Genome-wide association studies (GWAS) have identified single nucleotide polymorphisms (SNPs) significantly associated with chronic obstructive pulmonary disease (COPD). However, many genetic variants show suggestive evidence for association but do not meet the strict threshold for genome-wide significance. Integrative analysis of multiple omics datasets has the potential to identify novel genes involved in disease pathogenesis by leveraging these variants in a functional, regulatory context.

Results: We performed expression quantitative trait locus (eQTL) analysis, testing for SNPs associated with gene expression levels, using genome-wide SNP genotyping and gene expression profiling of lung tissue samples from 86 COPD cases and 31 controls. We integrated these results with a prior COPD GWAS using an ensemble of statistical and network methods to identify relevant genes and to place them in the context of overall genetic control of gene expression, highlighting co-regulated genes and disease pathways. The cis (local) eQTL analysis identified 250,312 unique SNPs and 4,997 genes at a 5% false discovery rate. The top gene from the integrative analysis was MAPT, a gene recently identified in an independent GWAS of lung function. The genes HNRNPAB and PCBP2, which have RNA binding activity, and the gene ACVR1B were identified in network communities with validated disease relevance.
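To illustrate what a cis-eQTL call at a 5% false discovery rate involves, the sketch below restricts SNP-gene pairs to a local window and then applies the Benjamini-Hochberg step-up procedure to their p-values. It is a minimal illustration and not the authors' pipeline: the abstract does not state the window size or the FDR method, so the 1 Mb window, the use of Benjamini-Hochberg, and the names `is_cis` and `benjamini_hochberg` are all assumptions, and the p-values are made up for demonstration.

```python
def is_cis(snp_chrom, snp_pos, gene_chrom, gene_tss, window=1_000_000):
    """Return True if the SNP lies within `window` bp of the gene start.

    The 1 Mb default is an assumed, commonly used cis window; the
    abstract does not report the window the study used.
    """
    return snp_chrom == gene_chrom and abs(snp_pos - gene_tss) <= window


def benjamini_hochberg(pvalues, alpha=0.05):
    """Return a list of booleans marking tests significant at FDR `alpha`.

    Step-up rule: sort the p-values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])

    # Largest rank whose p-value falls under its rank-specific threshold.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            max_k = rank

    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Illustrative (made-up) p-values for ten SNP-gene pairs.
example_pvalues = [0.001, 0.008, 0.039, 0.041, 0.042,
                   0.060, 0.074, 0.205, 0.212, 0.216]
example_calls = benjamini_hochberg(example_pvalues, alpha=0.05)
```

In a real analysis the p-values would come from regressing each gene's expression on genotype (typically with covariates) for every pair that passes the cis filter; the correction is then applied across all tested pairs.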

Conclusions: Integrating lung tissue gene expression with genome-wide SNP genotyping, and then intersecting the results with prior GWAS and omics studies, highlighted candidate genes within COPD loci and in communities harboring known COPD genes. This integration also identified novel disease genes in sub-threshold regions that would otherwise have been missed by GWAS alone.

Trial registration: ClinicalTrials.gov NCT00608764, NCT00292552.

Keywords: Bayesian methods; Ensemble methods; Expression QTL; Integrative genomics; Network medicine; eQTL.

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't
  • Research Support, U.S. Gov't, Non-P.H.S.
  • Research Support, U.S. Gov't, P.H.S.

MeSH terms

  • Activin Receptors, Type I / genetics
  • Adult
  • Aged
  • Female
  • Gene Expression Regulation
  • Genetic Predisposition to Disease*
  • Genome, Human / genetics*
  • Genome-Wide Association Study*
  • Genomics
  • Heterogeneous-Nuclear Ribonucleoprotein Group A-B / genetics
  • Humans
  • Lung / metabolism
  • Male
  • Middle Aged
  • Polymorphism, Single Nucleotide / genetics
  • Pulmonary Disease, Chronic Obstructive / genetics*
  • Pulmonary Disease, Chronic Obstructive / pathology
  • Quantitative Trait Loci / genetics
  • RNA-Binding Proteins / genetics
  • tau Proteins / genetics

Substances

  • HNRNPAB protein, human
  • Heterogeneous-Nuclear Ribonucleoprotein Group A-B
  • MAPT protein, human
  • PCBP2 protein, human
  • RNA-Binding Proteins
  • tau Proteins
  • ACVR1B protein, human
  • Activin Receptors, Type I

Associated data

  • ClinicalTrials.gov/NCT00608764
  • ClinicalTrials.gov/NCT00292552