Metagenomic analysis of viral genes integrated in whole genome sequencing data of Thai patients with Brugada syndrome

Genomics Inform. 2022 Dec;20(4):e44. doi: 10.5808/gi.22047. Epub 2022 Dec 30.

Abstract

Brugada syndrome (BS) is an autosomal dominant inherited cardiac arrhythmia disorder associated with sudden death in young adults. Thailand has the highest prevalence of BS worldwide, and the disease etiology remains unclear in over 60% of patients with BS. Here, we developed a new viral metagenome analysis pipeline called VIRIN and validated it with whole genome sequencing (WGS) data of HeLa cell lines and hepatocellular carcinoma. The VIRIN pipeline was then applied to identify viral integration positions from unmapped WGS data of Thai males, including 100 BS patients (cases) and 100 controls. Even though the sample preparation had no viral enrichment step, we identified several viral genes with our analysis pipeline. Human endogenous retrovirus K (HERV-K) viruses were predominant in both cases and controls in the blastn and blastx analyses. This study is the first report of full-length assembled HERV-K genomes in the Thai population. Furthermore, the HERV-K integration breakpoint positions were validated and compared between the case and control datasets. Interestingly, Brugada cases contained HERV-K integration breakpoints at promoters five times more often than controls. Overall, the highlight of this study is the set of BS-specific HERV-K breakpoint positions found in the coding regions of the genes "NBPF11" (n = 9) and "NBPF12" (n = 8) and in the long non-coding RNA (lncRNA) "PCAT14" (n = 4) region. These genes and this lncRNA have been reported to be associated with congenital heart and arterial diseases. These findings offer another perspective on BS etiology, one associated with viral genome integration into the human genome.
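The abstract states that viral sequences were searched for in the unmapped portion of the WGS data. As a minimal sketch of that first selection step only, and assuming the alignments are available as SAM text records, unmapped reads can be recognized by the standard "segment unmapped" bit (0x4) in the FLAG column. The function names and the toy records below are hypothetical illustrations and are not taken from the VIRIN pipeline.

```python
def is_unmapped(sam_line: str) -> bool:
    """Return True if a SAM alignment record has the 'segment unmapped' bit (0x4) set."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # column 2 of a SAM record is the bitwise FLAG
    return bool(flag & 0x4)


def unmapped_records(sam_lines):
    """Yield alignment records whose read is unmapped, skipping '@' header lines."""
    for line in sam_lines:
        if line.startswith("@") or not line.strip():
            continue
        if is_unmapped(line):
            yield line.rstrip("\n")


# Hypothetical toy records (not data from the study).
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t99\tchr1\t100\t60\t10M\t=\t300\t210\tACGTACGTAC\tFFFFFFFFFF",
    "read2\t77\t*\t0\t0\t*\t*\t0\t0\tTTGACCGATA\tFFFFFFFFFF",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tGGCATTACGA\tFFFFFFFFFF",
]
names = [rec.split("\t")[0] for rec in unmapped_records(example)]
print(names)
```

In practice the selected reads would then be passed to downstream steps such as the blastn and blastx searches mentioned above; those steps are not sketched here.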

Keywords: Brugada syndrome; VIRIN; human endogenous retrovirus K; metagenome; virus integration breakpoint; whole genome sequencing.