Abstract
Fire and smoke are dynamic, amorphous objects, which makes them challenging targets for object detection. To address this, we propose Dual-Path Cascade Stochastic DETR (Dual-Path CSDETR). Unlike Cascade DETR, our model introduces cascade stochastic attention (CSA), which uses variational inference to model the irregular morphologies of fire and smoke, and combines it with a dual-path architecture whose bidirectional feature interaction improves learning efficiency. Object-centric priors derived from bounding boxes are integrated into each decoder layer, refining the attention so that it focuses on critical regions. Experiments show that Dual-Path CSDETR achieves 94% AP50 on fire/smoke detection, surpassing deterministic baselines.