An improved chromosome-level genome assembly and annotation of Echeneis naucrates

Sci Data. 2024 May 4;11(1):452. doi: 10.1038/s41597-024-03309-w.

Abstract

Echeneis naucrates, as known as live sharksucker, is famous for the behavior of attaching to hosts using a highly modified dorsal fin with oval-shaped sucking disc. Here, we generated an improved high-quality chromosome-level genome assembly of E. naucrates using Illumina short reads, PacBio long reads and Hi-C data. Our assembled genome spans 572.85 Mb with a contig N50 of 23.19 Mb and is positioned to 24 pseudo-chromosomes. Additionally, at least one telomere was identified for 23 out of 24 chromosomes. Furthermore, we identified a total of 22,161 protein-coding genes, of which 21,402 genes (96.9%) were annotated successfully with functions. The combination of ab initio predictions and Repbase-based searches revealed that 15.57% of the assembled E. naucrates genome was identified as repetitive sequences. The completeness of the genome assembly and the gene annotation were estimated to be 97.5% and 95.4% with BUSCO analyses. This work enhances the utility of the live sharksucker genome and provides a valuable groundwork for the future study of genomics, biology and adaptive evolution in this species.

Publication types

  • Research Support, Non-U.S. Gov't
  • Dataset

MeSH terms

  • Animals
  • Chromosomes*
  • Fishes* / genetics
  • Genome*
  • Molecular Sequence Annotation