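Future optimization directions: the authors propose to explore merging the multi-scale structure into a single fine-to-coarse stream, using pooling between layers to reduce redundancy; this strategy is expected to preserve feature information while reducing computational complexity. Interpretability of the attention mechanism: although attention is used to build the adjacency matrix, there is no detailed explanation of which key connections contribute significantly to recognizing which action classes.
The part I am actually more interested in is the guided attention proposed in this paper. Because audio is sequential, the attention alignment follows the diagonal n ~ at, where a ~ N/T (figure borrowed from Prof. Hung-yi Lee's slides). Guided attention builds this condition into the attention as a constraint: whenever an alignment value deviates from this diagonal, it is penalized. Its function is shown below. 3 Experiments: Table 1 lists the parameter settings used in the paper...
Unsupervised Attention-guided Image-to-Image Translation. This is an image translation paper from NeurIPS 2018. Current unsupervised image-to-image translation techniques struggle to focus attention on the objects being changed without altering the background or the way multiple objects in the scene interact. The paper's solution is to use attention guidance for image translation. Below are the paper's result figures: as can be seen, the results are very good, only...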
In this paper, a novel approach, which is based on attention guided 3D convolutional neural networks (CNN)-long short-term memory (LSTM) model, is proposed for speech based emotion recognition. The proposed attention guided 3D CNN-LSTM model is trained in end-to-end fashion. The input speech...
Code implementations for several papers: "Attention-Guided Hierarchical Structure Aggregation for Image Matting" (CVPR 2020) GitHub: http://t.cn/A6zS3oi3 "Blurry Video Frame Interpolation" (CVPR 2020) GitHub: http://t.c...
AtLoc: Attention Guided Camera Localization - AAAI 2020 (Oral). Bing Wang, Changhao Chen, Chris Xiaoxuan Lu, Peijun Zhao, Niki Trigoni, and Andrew Markham. License: Licensed under the CC BY-NC-SA 4.0 license, see LICENSE. This is the PyTorch implementation of AtLoc, a simple and efficient neural architect...
Attention-guided CNN for image denoising (ADNet) by Chunwei Tian, Yong Xu, Zuoyong Li, Wangmeng Zuo, Lunke Fei and Hong Liu is published in Neural Networks (IF: 9.657), 2020 (https://www.sciencedirect.com/science/article/pii/S0893608019304241) and it is implemented in PyTorch. This paper ...
To address this difficult problem, this paper proposes a novel end-to-end attention-guided method based on multi-branch convolutional neural network. To this end, we first construct a synthetic dataset with carefully designed low-light simulation strategies. The dataset is much larger and more ...
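Unsupervised Attention-guided Image-to-Image Translation. Current unsupervised image-to-image translation struggles to focus attention on a single object without altering the background or the other objects in the scene. The authors propose a method to address this problem. Introduction: many current unsupervised image-to-image translation methods, including CycleGAN, cannot attend only to specific scene objects, as follows...
Guided Attention: in some tasks the input and output are monotonically aligned, for example speech recognition and speech synthesis (TTS). In these tasks the order in which attention scores are computed can become scrambled, so that good results cannot be obtained. Guided attention forces the attention into a fixed form; for example, a TTS problem needs attention to be computed from left to right. See also Monotonic Attention and Location-aware attention.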
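
A minimal sketch of one way to push attention toward a monotonic, near-diagonal alignment. It assumes the penalty form often used for guided attention in TTS, W[n][t] = 1 - exp(-(n/N - t/T)^2 / (2 g^2)), averaged against the attention matrix. The function names, the width g = 0.2, and the toy matrices are illustrative assumptions, not taken from any of the snippets above.

```python
import math


def guided_attention_weights(num_text, num_frames, g=0.2):
    """Penalty matrix W[n][t] = 1 - exp(-(n/N - t/T)^2 / (2 g^2)).

    W is 0 on the normalized diagonal n/N == t/T and approaches 1 far from it.
    g (assumed default 0.2) controls how wide the unpenalized band is.
    """
    return [
        [
            1.0 - math.exp(-((n / num_text - t / num_frames) ** 2) / (2.0 * g * g))
            for t in range(num_frames)
        ]
        for n in range(num_text)
    ]


def guided_attention_loss(attention, g=0.2):
    """Mean of attention[n][t] * W[n][t] over all (text position, frame) pairs."""
    num_text = len(attention)
    num_frames = len(attention[0])
    weights = guided_attention_weights(num_text, num_frames, g)
    total = sum(
        attention[n][t] * weights[n][t]
        for n in range(num_text)
        for t in range(num_frames)
    )
    return total / (num_text * num_frames)


# Toy 4x4 alignments: one perfectly diagonal, one reversed (anti-diagonal).
diagonal = [[1.0 if n == t else 0.0 for t in range(4)] for n in range(4)]
reversed_alignment = diagonal[::-1]

print(guided_attention_loss(diagonal))            # no penalty on the diagonal
print(guided_attention_loss(reversed_alignment))  # strictly larger penalty
```

In training, a term like this would typically be added to the main loss so that attention mass far from the diagonal is discouraged; a smaller g tightens the band around the diagonal.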