To mitigate this limitation, we propose YOLO-DCTI, a framework that builds on the Contextual Transformer (CoT) to detect small and tiny objects. Specifically, within CoT, we incorporate global residuals and local fusion mechanisms through...
Figure 1. The overall framework of YOLO-DCTI. DCTI consists of CoT-I and a Decoupled-Head. The FPN features are fed to the CoT-I module, which models global contextual information and spatial relationships. A dual-branch architecture is then employed to effectively ext...
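
The data flow in Figure 1 (FPN feature, then a context module, then a two-branch head) can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration: `context_block` is a hypothetical shape-preserving stand-in and does not reproduce the CoT-I computation, the two branches are assumed to be classification and box regression as in common decoupled-head designs, and the tensor sizes and random weights are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)


def conv1x1(x, weight):
    """Pointwise (1x1) convolution: x is (C_in, H, W), weight is (C_out, C_in)."""
    return np.einsum("oc,chw->ohw", weight, x)


def context_block(x):
    """Hypothetical stand-in for the CoT-I module (NOT the paper's computation).

    It adds a globally pooled per-channel summary back onto the input as a
    residual, so the output keeps the input shape.
    """
    global_context = x.mean(axis=(1, 2), keepdims=True)  # (C, 1, 1)
    return x + global_context


def decoupled_head(x, w_cls, w_reg):
    """Two independent branches applied to the same feature map.

    Assumption: one branch predicts class logits, the other box offsets.
    """
    cls_logits = conv1x1(x, w_cls)  # (num_classes, H, W)
    box_preds = conv1x1(x, w_reg)   # (4, H, W)
    return cls_logits, box_preds


# Arbitrary illustrative sizes for a single FPN level.
channels, height, width, num_classes = 16, 8, 8, 3
fpn_feature = rng.standard_normal((channels, height, width))
w_cls = rng.standard_normal((num_classes, channels))
w_reg = rng.standard_normal((4, channels))

refined = context_block(fpn_feature)
cls_logits, box_preds = decoupled_head(refined, w_cls, w_reg)
print(cls_logits.shape, box_preds.shape)
```

The point of the sketch is only that the two branches share the refined feature map but keep separate weights, so the classification and localization outputs are produced independently at every spatial position.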