Abstract
High-throughput sequencing generates vast amounts of data, which often contain low-quality bases, chimeras, and artifacts that can mislead taxonomic classification and diversity assessments. The Divisive Amplicon Denoising Algorithm 2 (DADA2) enhances taxonomic resolution by excluding low-quality bases and optimizing amplicon sequence variant inference. Proper truncation reduces computational load while retaining the key hypervariable regions needed for accurate classification. In this study, we examine how the choice of truncation length in DADA2 analysis affects statistical robustness and the reliability of microbial community profiling in ecological and environmental studies. Truncating reads to lengths between 175 and 185 bp improves the recovery rate of high-quality reads and preserves microbial diversity in the V4 hypervariable region of Illumina paired-end reads. Adopting an optimal truncation length therefore improves read recovery while preserving the richness and evenness of microbial communities.