Abstract
Leaves are central indicators of photosynthesis and plant growth status, and their precise monitoring is crucial for smart agriculture. Dense leaf detection, as the foundation of leaf morphology analysis, must cope with occlusion and overlap, and it directly enables key tasks such as phenotypic trait extraction, disease identification, and yield estimation. Existing dense leaf detection methods rely on traditional modular detectors and generic feature extraction, lacking designs tailored to real-world dense leaf scenarios. In complex field conditions, they still face challenges such as incomplete individual feature extraction caused by high leaf overlap and difficult network convergence caused by excessive leaf density. To this end, we propose the Leaf-DETR framework, which addresses these challenges through a Progressive Feature Fusion Pyramid Network (P-FPN) and a Crowded Query Refinement (CQR) strategy. First, we construct the largest dense leaf detection dataset to date, containing 1,696 images and 85,375 annotated boxes. Second, P-FPN alleviates feature confusion among overlapping leaves through multi-stage feature fusion and an Adaptive Feature Aggregation (AFA) module, strengthening the interaction between low-level details and high-level semantics. Third, the CQR strategy culls crowded queries and introduces a one-to-many matching mechanism, significantly reducing the matching cost of crowded candidate boxes and improving convergence efficiency.
Finally, experimental results show that Leaf-DETR improves mAP@50 by 1% and AR@300 by 1.4% over the baseline model on our self-constructed dataset, outperforming existing detection methods. The model also converges rapidly during training and generalizes well to both field-collected monitoring images and other staple crops, highlighting its practical value in complex agricultural scenarios. The code and detailed information are available at http://leafdetr.samlab.cn.