Dense Cross-Modal Correspondence Estimation With the Deep Self-Correlation Descriptor. doi:10.1109/tpami.2020.2965528. Seungryong Kim, Dongbo Min, Stephen Lin, Kwanghoon Sohn. IEEE Computer Society.
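For feature extraction within a single modality, convolutional neural network (CNN) architectures have dominated computer vision ever since AlexNet [16] achieved its breakthrough on the ImageNet image classification task. A large body of research builds on CNNs [17,18], and the most common representation learning for remote sensing images is also CNN-based [19,20]. In recent years, the Transformer architecture, which originated in natural language processing, ...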
b, Global alignment of a partial image to CCFv3. The left column shows the maximum intensity projection of the globally aligned partial brain (top) and its blended view with the CCFv3 average template (bottom); the corresponding three-view projections are shown on the right. c, Result after semi-automatic refinement...
In this paper, we propose a deep Cross-Modal Attention Network (CMAN) for joint entity and relation extraction. The network is carefully constructed by stacking multiple attention units in depth to fully model dense interactions over token-label spaces, in which two basic attention units are ...
Dense-captioning events in videos. In: Proceedings of the IEEE Conference on Computer Vision, Venice, 2017. 706--715. [37] Wang S J, Wang R P, Yao Z W, et al. Cross-modal scene graph matching for relationship-aware image-text retrieval. In: Proceedings of Winter ...
@inproceedings{peng2021sparse, title={Sparse-to-dense Feature Matching: Intra and Inter domain Cross-modal Learning in Domain Adaptation for 3D Semantic Segmentation}, author={Peng, Duo and Lei, Yinjie and Li, Wen and Zhang, Pingping and Guo, Yulan}, booktitle={Proceedings of the International...
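Exemplar-based image synthesis: Recently, several works [39, 44, 34, 40, 2] have proposed synthesizing photorealistic images from semantic layouts under the guidance of exemplars. Non-parametric or semi-parametric approaches [39, 2] synthesize images by compositing image fragments retrieved from a large database. However, mainstream works formulate the problem as image-to-...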
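At inference time, only the student model branch is kept. BEVDistill involves two kinds of knowledge distillation, one on BEV features and one on detection results, called dense feature distillation and sparse instance distillation, respectively. Each name carries a qualifier: dense means that every position of the BEV feature map must be aligned, while sparse refers to the fact that objects (that is, instances) are sparsely distributed in space.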
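The dense versus sparse distinction can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names, the single-channel grid, the per-instance scores, and the choice of mean squared error are hypothetical and are not taken from BEVDistill itself.

```python
# Illustrative sketch only: names, shapes, and the squared-error loss are
# assumptions, not BEVDistill's actual formulation.

def dense_feature_loss(student_bev, teacher_bev):
    """Mean squared error over every cell of two H x W BEV grids."""
    total, count = 0.0, 0
    for s_row, t_row in zip(student_bev, teacher_bev):
        for s, t in zip(s_row, t_row):
            total += (s - t) ** 2
            count += 1
    return total / count


def sparse_instance_loss(student_scores, teacher_scores):
    """Mean squared error over per-instance scores only (one value per object)."""
    diffs = [(s - t) ** 2 for s, t in zip(student_scores, teacher_scores)]
    return sum(diffs) / len(diffs)


# Toy 2 x 2 BEV grids (hypothetical single-channel features).
teacher_bev = [[1.0, 0.0], [0.0, 2.0]]
student_bev = [[0.0, 0.0], [0.0, 0.0]]
dense = dense_feature_loss(student_bev, teacher_bev)

# Two detected instances with hypothetical confidence scores.
sparse = sparse_instance_loss([0.5, 0.5], [1.0, 0.0])
```

The dense term visits every grid position, so its cost grows with the size of the BEV map, whereas the sparse term only touches as many entries as there are instances.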
To generate crops on pairs of images while maintaining overlaps, we rely on quasi-dense keypoint matching, namely DeepMatching [60], except for pairs from ARKitScenes where we directly use matches from the mesh. Given the matches, we consider a...
Cross-modal missing time-series imputation using dense spatio-temporal transformer nets. doi:10.3934/mbe.2024220. Qian, Xusheng; Zhang, Teng; Miao, Meng; Xu, Gaojun; Zhang, Xuancheng; Yu, Wenwu; Chen, Duxin. Mathematical Biosciences & Engineering.