Following is the list of accepted ICASSP 2024 papers, sorted by paper title. You can use the search feature of your web browser to find your paper number. Notifications to all authors have also been sent by email. If you have not received your notification of the results by email, please...
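The article first questions how well mixup generalizes on long-tailed datasets, pointing out that mixup may make the imbalance on such datasets worse: because the primary class is chosen at random and the original data are imbalanced, most mixup samples are in fact based on head classes (see Figure 1). Existing methods usually either adjust mixup's parameters to raise the proportion of tail classes in the generated images, or use resampling to increase tail cl...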
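The imbalance argument above can be checked with a little arithmetic. The following is a minimal sketch, assuming both images of a mixup pair are drawn uniformly and independently from the dataset; the class names and counts are hypothetical and are not taken from the article.

```python
# Hypothetical long-tailed class counts (illustration only, not from the article).
class_counts = {"head_a": 500, "head_b": 400, "tail_a": 60, "tail_b": 40}
head_classes = {"head_a", "head_b"}


def head_primary_share(counts, heads):
    """Probability that the primary image of a mixup pair is from a head class,
    assuming the primary image is sampled uniformly from the dataset."""
    total = sum(counts.values())
    head_total = sum(n for cls, n in counts.items() if cls in heads)
    return head_total / total


def head_only_pair_share(counts, heads):
    """Probability that BOTH images of a pair are from head classes,
    assuming the two images are sampled independently."""
    p_head = head_primary_share(counts, heads)
    return p_head * p_head


p_primary = head_primary_share(class_counts, head_classes)
p_both = head_only_pair_share(class_counts, head_classes)
print(f"primary image is head class: {p_primary:.2f}")
print(f"pair contains no tail image: {p_both:.2f}")
```

Under these assumptions the share of head-based mixup samples simply mirrors the share of head-class images, which is why random pairing alone does nothing to rebalance the data.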
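paper: http://arxiv.org/abs/2303.05338 code: 1 code implementation (in PyTorch) keywords: #multimodal-balance #multimodal-fusion importance: #star4 tl;dr: Field of this paper: Audio-Visual Fine-Grained (AVFG). On fine-grained tasks (for example, telling apart bird species and their calls), joint multimodal training shows that the previously introduced methods OGM-GE and G-blending actually perform worse than...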
During lyrics transcription inference, a small trick is employed: we build a suppress_token list. With this list, we force the model to only generate tokens that exist in the training set. We empirically found that this works well when the scale of available training data is small (in this ca...
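The suppression trick described above can be sketched in plain Python. This is only an illustration under stated assumptions, not the authors' implementation: the helper names (`build_suppress_tokens`, `apply_suppression`, `greedy_pick`), the toy vocabulary size, and the logits are all hypothetical. Suppression is modeled here by setting the logit of each suppressed token to negative infinity before decoding.

```python
import math


def build_suppress_tokens(vocab_size, training_token_ids):
    """Return every token id in the vocabulary that never occurs in training."""
    allowed = set(training_token_ids)
    return [tok for tok in range(vocab_size) if tok not in allowed]


def apply_suppression(logits, suppress_tokens):
    """Copy the logits and set suppressed tokens to -inf so they cannot be chosen."""
    masked = list(logits)
    for tok in suppress_tokens:
        masked[tok] = -math.inf
    return masked


def greedy_pick(logits):
    """Index of the highest logit (greedy decoding for one step)."""
    return max(range(len(logits)), key=lambda i: logits[i])


# Toy setup (hypothetical): a 6-token vocabulary and two tokenized training lines.
vocab_size = 6
training_sequences = [[0, 2, 3], [3, 5]]
seen = {tok for seq in training_sequences for tok in seq}

suppress_tokens = build_suppress_tokens(vocab_size, seen)

# Token 4 has the highest raw logit but never appears in training,
# so after masking the decoder falls back to the best token it has seen.
logits = [0.1, 2.5, 0.3, 1.2, 3.0, 0.2]
picked = greedy_pick(apply_suppression(logits, suppress_tokens))
print(suppress_tokens, picked)
```

Masking at the logit level leaves the model weights untouched, so the list can be rebuilt cheaply whenever the training set changes.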
The code for ICASSP 2023 paper: MRML: Multimodal Rumor Detection by Deep Metric Learning. - plw-study/MRML
19/11/2023 14:00: Reviews can now be viewed: The icassp24 paper reviews (not completely done) are ...
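Human-computer interaction, 17 18 20: is there any hope? The first two each include a 2; is there a chance? Update 12.12: found it in the accept paper list, it got in...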
This repository contains the PyTorch code for the 2023 ICASSP paper "Preformer: Predictive Transformer with Multi-Scale Segment-wise Correlations for Long-Term Time Series Forecasting" - ddz16/Preformer
An official implementation of the ICASSP 2023 paper: SG-VAD: Stochastic Gates Based Speech Activity Detection - jsvir/vad
When replicating the two-stage training process from our paper (training with LibriTTS and then LibriTTS+VCTK), please put both lists of speaker IDs from LibriTTS and VCTK in the global config. f0s_list_path is set to f0s.txt by default. config/cota: Configs for training Cotatron. You may wa...