Knowledge Distillation (KD) aims at improving the performance of a low-capacity student model by inheriting knowledge from a high-capacity teacher model. Previous KD methods typically train a student by minimizing a task-related loss and the KD loss simultaneously, using a pre-defined loss
...