Optimization of enzymatic fragmentation is crucial to maximize genome coverage: a comparison of library preparation methods for Illumina sequencing

BMC Genomics. 2022 Feb 1;23(1):92. doi: 10.1186/s12864-022-08316-y.

Abstract

Background: Novel commercial kits for whole genome library preparation for next-generation sequencing on Illumina platforms promise shorter workflows, lower inputs and cost savings. Time savings are achieved by employing enzymatic DNA fragmentation and by combining end-repair and tailing reactions. Fewer cleanup steps also allow greater DNA input flexibility (1 ng to 1 μg), PCR-free options from 100 ng of DNA, and a lower price than the well-established sonication- and tagmentation-based DNA library preparation kits.

Results: We compared the performance of four enzymatic fragmentation-based DNA library preparation kits (from New England Biolabs, Roche, Swift Biosciences and Quantabio) to a tagmentation-based kit (Illumina) using low-input DNA amounts (10 ng) and PCR-free reactions with 100 ng of DNA. With four technical replicates of each input amount and kit, we compared the kits' fragmentation sequence bias as well as performance parameters such as sequence coverage and the clinically relevant detection of single nucleotide variants (SNVs) and indels. While all kits produced high-quality sequence data and demonstrated similar performance, several enzymatic fragmentation methods produced library insert sizes that deviated from those intended. Libraries with longer insert lengths performed better in terms of coverage and of SNV and indel detection. The lower performance of shorter-insert libraries could be explained by the loss of sequence coverage to overlapping paired-end reads, exacerbated by the preferential sequencing of shorter fragments on Illumina sequencers. We also observed that libraries prepared with minimal or no PCR performed best with regard to indel detection.

Conclusions: The enzymatic fragmentation-based DNA library preparation kits from NEB, Roche, Swift and Quantabio are good alternatives to the tagmentation-based Nextera DNA Flex kit from Illumina, offering reproducible results with flexible DNA inputs, quick workflows and lower prices. Libraries whose insert DNA fragments are longer than the combined length of both reads avoid read overlap and thus produce more informative data, which leads to strongly improved genome coverage and, consequently, increased sensitivity and precision of SNP and indel detection. To make the best use of such enzymatic fragmentation reagents, researchers should be prepared to invest time in optimizing fragmentation conditions for their particular samples.
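
The relationship between insert size and read overlap can be illustrated with a little arithmetic. The sketch below is not taken from the study; it assumes an idealized read pair with no adapter read-through, trimming or soft clipping, and the 150 bp read length and the function names are illustrative choices rather than values reported in the abstract.

```python
def paired_end_overlap(insert_size: int, read_length: int) -> int:
    """Bases sequenced by both reads of a pair (idealized model)."""
    # A read cannot yield more insert-derived bases than the insert contains.
    effective = min(read_length, insert_size)
    # The two reads start at opposite ends; they overlap once their
    # combined length exceeds the insert length.
    return max(0, 2 * effective - insert_size)


def unique_bases(insert_size: int, read_length: int) -> int:
    """Distinct insert bases covered by the read pair."""
    effective = min(read_length, insert_size)
    return 2 * effective - paired_end_overlap(insert_size, read_length)


if __name__ == "__main__":
    read_length = 150  # assumed 2 x 150 bp run; hypothetical, not from the source
    for insert_size in (150, 200, 300, 400):
        overlap = paired_end_overlap(insert_size, read_length)
        unique = unique_bases(insert_size, read_length)
        print(f"insert={insert_size} bp  overlap={overlap} bp  unique={unique} bp")
```

Under these assumptions, a pair yields the full 2 × read length of distinct bases only when the insert is at least as long as both reads combined; shorter inserts spend part of the sequencing output on bases that are read twice.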

Keywords: Next Generation Sequencing; Whole genome sequencing; enzymatic fragmentation; insert size; library preparation.

MeSH terms

  • Gene Library
  • Genome*
  • High-Throughput Nucleotide Sequencing*
  • Polymerase Chain Reaction
  • Sequence Analysis, DNA