Paper: Fusion-Mamba for Cross-modality Object Detection (arXiv:2404.09146). Multimodal fusion is a classic approach: under laboratory conditions, even simple fusion can achieve strong performance. In the real world, however, cameras of different modalities differ greatly in focal length, placement, and viewing angle, which makes cross-modality fusion highly challenging. Unlike existing fusion methods, the authors construct a Fusion-Mamba block, used in the latent space to...
Fusion-Mamba for Cross-modality Object Detection. Affiliations: Beihang University, East China Normal University, Tencent Youtu Lab, Eastern Institute of Technology. Paper: https://arxiv.org/abs/2404.09146. CVPR 2024 paper and open-source project collection: ...
Cross-modality fusion of complementary information from different modalities effectively improves object detection performance, making it more useful and robust for a wider range of applications. Existing fusion strategies combine different types of images or merge different backbone features through elaborated neu...
Fusion-Mamba (ours)   YOLOv5   85.0  57.5  80.3  92.8  91.9  73.0  84.8  87.1
YOLOv8l-IR            YOLOv8   79.5  53.1  82.9  90.9  90.0  64.6  63.0  85.9
YOLOv8l-RGB [13]      YOLOv8   80.9  52.5  70.6  92.9  91.2  69.6  75.3  86.0
Fusion-Mamba (ours)   YOLOv8   88.0  61.9  84.3  94.2  92.9  80.5  87.5  88.8
Table 3: Comparison results with SOTA methods on the FLIR-Aligned dataset. Th...
Additionally, we devise a dynamic feature fusion module (DFFM) comprising two dynamic feature enhancement modules (DFEMs) and a cross-modality fusion Mamba module (CMFM). The former performs dynamic texture enhancement and dynamic difference perception, whereas the latter enhances correlation ...
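The two-stage structure described above (per-modality enhancement, then cross-modality sequence fusion) can be sketched minimally as follows. This is our own illustration, not the paper's implementation: the gating in `dfem`, the first-order recurrence standing in for a Mamba-style selective scan in `cmfm`, and all shapes and names are assumptions.

```python
import numpy as np

def dfem(feat):
    """Dynamic feature enhancement (sketch): rescale each channel by a
    sigmoid gate derived from its global average, so more active channels
    are emphasized. Input and output shape: (C, H, W)."""
    gate = 1.0 / (1.0 + np.exp(-feat.mean(axis=(1, 2), keepdims=True)))
    return feat * gate

def cmfm(rgb, ir, alpha=0.9):
    """Cross-modality fusion (sketch): interleave the two modalities into one
    sequence and run a simple first-order recurrence over it (a crude
    stand-in for a selective scan), so each step's hidden state mixes
    information from both modalities; then fold the sequence back."""
    C, H, W = rgb.shape
    seq = np.stack([rgb.reshape(C, -1), ir.reshape(C, -1)], axis=2).reshape(C, -1)
    h = np.zeros(C)
    out = np.empty_like(seq)
    for t in range(seq.shape[1]):
        h = alpha * h + (1.0 - alpha) * seq[:, t]  # shared hidden state
        out[:, t] = h
    # average the interleaved RGB/IR positions back to one (C, H, W) map
    return out.reshape(C, -1, 2).mean(axis=2).reshape(C, H, W)

def dffm(rgb, ir):
    """DFFM sketch: two DFEMs (one per modality) followed by a CMFM."""
    return cmfm(dfem(rgb), dfem(ir))

rgb = np.random.rand(8, 4, 4).astype(np.float32)
ir = np.random.rand(8, 4, 4).astype(np.float32)
fused = dffm(rgb, ir)
print(fused.shape)  # (8, 4, 4)
```

The point of the sketch is only the data flow: each modality is enhanced independently, and fusion happens through a shared sequential hidden state rather than by naive concatenation.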
Official implementation for “MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion.” - Lizhe1228/MambaDFuse
Locality guided cross-modal feature aggregation and pixel-level fusion for multispectral pedestrian detection, Information Fusion 2022, Yanpeng Cao et al. [PDF] Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection, CVPR ...
Similar to the cross-entropy loss, \({G}_{\left(x,y\right)}\in \left\{0,1\right\}\) denotes whether the pixel at coordinates \(\left(x,y\right)\) belongs to the object in the ground-truth values of vegetable diseases. \({S}_{\left(x,y\right)}\) represents the predicted...
MSHP3D: Multi-stage cross-modal fusion based on Hybrid Perception for indoor 3D object detection (Xiangyang Jiang, Dakai Wang, Kunpeng Bi, Shuang Wang, Miaohui Zhang)
TDF-Net: Trusted Dynamic Feature Fusion Network for breast cancer diagnosis using incomplete multimodal ...