This paper proposes a new method: Task-Oriented Feature Distillation (TOFD). The transformation used in the paper is built from convolutional layers and is trained with the task loss. The benefit of this design is that the task-oriented information in the features can be captured and distilled to the students. In addition, an orthogonal loss is applied to TOFD's feature resizing layers, which further improves knowledge distillation (KD) performance. Introduction: Knowledge distillation...
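As a minimal sketch of the orthogonal regularizer idea mentioned above (assuming a 1×1 convolution acts as the feature resizing layer; the names `orthogonal_loss` and `resize_layer` are illustrative, not from the paper), one might penalize the deviation of the layer's Gram matrix from the identity:

```python
import torch
import torch.nn as nn

def orthogonal_loss(weight: torch.Tensor) -> torch.Tensor:
    """Penalize deviation of W W^T from the identity (squared Frobenius norm).

    The weight is flattened to 2D, so this works for both nn.Linear and
    1x1 nn.Conv2d weights.
    """
    w = weight.flatten(1)                       # (out_channels, in_features)
    gram = w @ w.t()                            # (out, out) Gram matrix
    identity = torch.eye(gram.size(0), device=w.device)
    return ((gram - identity) ** 2).sum()

# Hypothetical feature resizing layer: map student channels to teacher channels.
resize_layer = nn.Conv2d(in_channels=256, out_channels=512,
                         kernel_size=1, bias=False)
reg = orthogonal_loss(resize_layer.weight)      # add to the total loss, scaled
```

The intuition is that a (near-)orthogonal resizing layer preserves distances between features, so aligning resized student features with teacher features loses less information; the scaling coefficient for `reg` would be a tuning choice.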
In this paper, we propose a novel distillation method named task-oriented feature distillation (TOFD) where the transformation is convolutional layers that are trained in a data-driven manner by task loss. As a result, the task-oriented information in the features can be captured and distilled ...
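To make the "transformation trained by task loss" concrete, here is a hedged sketch of how such a module could be wired up, assuming an image-classification task (the class name `TaskOrientedHead`, the channel sizes, and the loss weights are all illustrative assumptions, not the paper's exact architecture):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TaskOrientedHead(nn.Module):
    """Hypothetical transformation: conv layers plus a classifier on an
    intermediate student feature map. Training the classifier with the task
    loss pushes the transformed feature to keep task-relevant information."""
    def __init__(self, in_channels: int, teacher_channels: int, num_classes: int):
        super().__init__()
        self.transform = nn.Sequential(
            nn.Conv2d(in_channels, teacher_channels, kernel_size=3, padding=1),
            nn.BatchNorm2d(teacher_channels),
            nn.ReLU(inplace=True),
        )
        self.classifier = nn.Linear(teacher_channels, num_classes)

    def forward(self, feat: torch.Tensor):
        t = self.transform(feat)                      # transformed student feature
        logits = self.classifier(t.mean(dim=(2, 3)))  # global average pool -> logits
        return t, logits

# One illustrative training step (shapes and weights are assumptions):
head = TaskOrientedHead(in_channels=256, teacher_channels=512, num_classes=100)
student_feat = torch.randn(8, 256, 14, 14)
teacher_feat = torch.randn(8, 512, 14, 14)            # from the frozen teacher
labels = torch.randint(0, 100, (8,))

t, logits = head(student_feat)
task_loss = F.cross_entropy(logits, labels)           # trains the transformation
distill_loss = F.mse_loss(t, teacher_feat)            # align with teacher features
loss = task_loss + 0.1 * distill_loss
```

Under this reading, "data-driven" means the transformation's parameters are learned jointly with the distillation objective rather than fixed by hand, so the features it produces are shaped by what the task actually needs.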
However, the stored exemplars would raise data privacy concerns, while the stored prototypes might not be consistent with the true feature distribution, hindering the exploration of real-world CIL applications. In this paper, we propose a method of embedding distillation and...