
Trunk-to-Branch: Lightweight Multi-Sub Distillation with High-Order Feature Semantics

Abstract

This paper proposes a novel online knowledge distillation framework that enables a wide range of vision models to generate and transfer knowledge from multiple perspectives using only lightweight additional networks. Previous online distillation studies have focused on self-distillation without pre-trained teacher models, aiming to generate and transfer new knowledge on their own. However, they suffer from several limitations, including high computational overhead, semantic gaps between stages, and redundant representations. To address these issues, we propose a method that creates multiple peer branches from a single lightweight layer instead of stacking deep layers, and we introduce a learning algorithm that reduces correlation among the peer branches to enhance representational diversity. As a result, the backbone network can effectively learn diverse information from the peer branches with minimal additional resources, and because the peer branches are removed at inference time, the original model's inference speed is preserved. Moreover, unlike previous studies that mainly demonstrated improvements on basic models such as ResNet, we validate the effectiveness of our approach on modern architectures such as ConvNeXt and CSWin.

Keywords: Knowledge Distillation, Lightweight Peer Branches, Semantic Alignments.
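The abstract's core mechanism is a loss that reduces correlation among peer-branch features so the branches supply diverse knowledge to the backbone. The thesis does not give its formula here, so the following is only a minimal sketch of one plausible formulation: penalizing the mean squared pairwise cosine similarity between branch features. The function name `decorrelation_loss` and the exact penalty are assumptions for illustration, not the paper's definition.

```python
import numpy as np

def decorrelation_loss(branch_feats):
    """Hypothetical decorrelation penalty over K peer branches.

    branch_feats: list of K arrays, each of shape (batch, dim).
    Returns the mean squared cosine similarity over all branch pairs,
    so identical branches score 1.0 and uncorrelated branches near 0.
    """
    # L2-normalize each branch's features along the feature dimension
    normed = [f / (np.linalg.norm(f, axis=1, keepdims=True) + 1e-8)
              for f in branch_feats]
    loss, pairs = 0.0, 0
    for i in range(len(normed)):
        for j in range(i + 1, len(normed)):
            # per-sample cosine similarity between branches i and j
            cos = np.sum(normed[i] * normed[j], axis=1)
            loss += np.mean(cos ** 2)  # squared: anti-correlation penalized too
            pairs += 1
    return loss / max(pairs, 1)

rng = np.random.default_rng(0)
f = rng.normal(size=(4, 16))
g = rng.normal(size=(4, 16))
print(decorrelation_loss([f, f]))  # identical branches: maximal penalty (1.0)
print(decorrelation_loss([f, g]))  # independent branches: small penalty
```

Minimizing such a term alongside the distillation objective pushes the peer branches toward non-redundant representations, which is the stated goal of the diversity loss in Section III.B.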


Table of Contents

I. Introduction
II. Related Work
A. Conventional Knowledge Distillation
B. Online Distillation
1) Self-Knowledge Distillation
2) Multi-View Knowledge Distillation
III. Method
A. Semantic Permeation Structure
1) Feature Concatenation
2) Semantic Alignment via Backpropagation
B. Diverse Lightweight Peer Branches
1) Peer Branch Architecture
2) Decorrelation Loss
3) Peer Branches Distillation
IV. Experiment
A. Experimental Setup
B. Image Classification
C. Comparison with Conventional Distillation
V. Ablation
A. Influence from the SPS
B. Branch-wise and Ensemble Distillation
C. Hyperparameter Analysis
D. Branching Point Location
VI. Conclusion
References
