Young genes have distinct gene structure, epigenetic profiles, and transcriptional regulation

Genome Res. 2018 Nov;28(11):1675-1687. doi: 10.1101/gr.234872.118. Epub 2018 Sep 19.

Abstract

Species-specific, new, or "orphan" genes account for 10%-30% of eukaryotic genomes. Although initially considered to have limited function, an increasing number of orphan genes have been shown to provide important phenotypic innovation. How new genes acquire regulatory sequences for proper temporal and spatial expression is unknown. Orphan gene regulation may rely in part on origination in open chromatin adjacent to preexisting promoters, although this has not yet been assessed by genome-wide analysis of chromatin states. Here, we combine taxon-rich nematode phylogenies with Iso-Seq, RNA-seq, ChIP-seq, and ATAC-seq to identify the gene structure and epigenetic signature of orphan genes in the satellite model nematode Pristionchus pacificus. Consistent with previous findings, we find young genes are shorter, contain fewer exons, and are on average less strongly expressed than older genes. However, the subset of orphan genes that are expressed exhibit distinct chromatin states from similarly expressed conserved genes. Orphan gene transcription is determined by a lack of repressive histone modifications, confirming long-held hypotheses that open chromatin is important for new gene formation. Yet orphan gene start sites more closely resemble enhancers defined by H3K4me1, H3K27ac, and ATAC-seq peaks, in contrast to conserved genes that exhibit traditional promoters defined by H3K4me3 and H3K27ac. Although the majority of orphan genes are located on chromosome arms that contain high recombination rates and repressive histone marks, strongly expressed orphan genes are more randomly distributed. Our results support a model of new gene origination by rare integration into open chromatin near enhancers.

Publication types

  • Research Support, Non-U.S. Gov't

MeSH terms

  • Animals
  • Chromatin Assembly and Disassembly
  • Epigenesis, Genetic*
  • Evolution, Molecular*
  • Helminth Proteins / chemistry
  • Helminth Proteins / genetics*
  • Helminth Proteins / metabolism
  • Histone Code
  • Rhabditida / genetics*
  • Rhabditida / metabolism
  • Transcriptional Activation

Substances

  • Helminth Proteins