Abstract
Microfauna play a crucial role in the ecological functioning and monitoring of activated sludge systems. However, existing studies often rely on manual observation and lack standardized image datasets for developing automated recognition models. Here, we present the Microfauna Dataset in Activated Sludge (MD-AS-2025), a publicly available image dataset designed to support object detection, classification, and related computer vision tasks. The dataset comprises two components: an annotated dataset (Dataset A), containing 4,000 full-frame microscopic images labeled across five morphologically and ecologically distinct microfaunal groups, and a cropped dataset (Dataset B), consisting of 14,257 single-object images generated through model-assisted detection and manual verification. All images were acquired under realistic activated sludge conditions using a high-throughput imaging system. The five groups reflect practical functional groupings commonly used in activated sludge analysis. MD-AS-2025 offers a reusable and extensible resource for developing intelligent microfauna analysis tools and provides a scalable framework for dataset construction in aquatic environments.