How good are global DNA-based environmental surveys for detecting all protist diversity? Arcellinida as an example of biased representation

Environ Microbiol. 2024 Mar;26(3):e16606. doi: 10.1111/1462-2920.16606.

Abstract

Metabarcoding approaches targeting microeukaryotes have profoundly changed our view of protist environmental diversity. The public repository EukBank consists of 18S V4 metabarcodes from 12,672 samples worldwide. To assess how well this database represents overall eukaryotic diversity, we used Arcellinida (lobose testate amoebae) as a case study. We hypothesised that (1) this approach would allow the discovery of unexpected diversity, but also that (2) some groups would be underrepresented because of primer/sequencing biases. Most Arcellinida sequences were found in freshwater and soil, but their abundance and diversity appeared underrepresented. Moreover, 84% of ASVs belonged to the suborder Phryganellina, a supposedly species-poor clade, whereas the best-documented suborder (Glutinoconcha, 600 described species) was only marginally represented. We explored some possible causes of these biases. Mismatches in the primer-binding site seem to play a minor role. Excessive length of the target region could explain some of these biases, but not all; other, as yet unknown factors must also be involved. Altogether, while metabarcoding based on ribosomal genes remains a good first approach to document microbial eukaryotic clades, alternative approaches based on other genes or sequencing techniques must be considered to obtain an unbiased picture of the diversity of some groups.

MeSH terms

  • Amoeba*
  • DNA
  • Eukaryota* / genetics
  • Phylogeny
  • Soil

Substances

  • DNA
  • Soil