
Real-Time Lightweight Human Parsing Based on Class Relationship Knowledge Distillation

Abstract

In the field of computer vision, understanding humans is a crucial and challenging task, as it requires recognizing and comprehending human presence and behavior in images or videos. Within this domain, human parsing is especially difficult, as it requires accurately locating the human region and dividing it into multiple semantic parts. It is a dense prediction task that demands substantial computation and high-precision models. With the continuous development of computer vision technologies, human parsing has been widely applied to other human-centric tasks, such as pose estimation and human image generation, and these applications are expected to play an increasingly important role in future artificial intelligence research.

To achieve real-time human parsing on devices with limited computational resources, we design and introduce a lightweight human parsing model. We choose ResNet18 as the backbone and simplify the traditional pyramid module used to obtain high-resolution contextual information, significantly reducing model complexity. To further enhance parsing accuracy, we integrate a spatial attention fusion strategy. Our lightweight model runs efficiently and achieves high segmentation accuracy on Look into Person (LIP), the most commonly used dataset for human parsing. Although traditional models perform excellently in terms of segmentation accuracy, their high complexity and large parameter counts restrict their use on devices with limited computational resources.

To further improve the accuracy of our lightweight network, we also apply knowledge distillation. Traditional knowledge distillation uses the Kullback-Leibler (KL) divergence to match the prediction probability scores of the teacher and student models. However, this approach can fail to transfer useful knowledge when there is a significant capacity gap between the teacher and student networks. We therefore adopt a new distillation criterion based on inter-class and intra-class relationships in the prediction results, which significantly improves parsing accuracy. Experiments show that, while maintaining high segmentation accuracy, our lightweight model substantially reduces the number of parameters, thereby achieving our expected goals.
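The abstract contrasts standard KL-based logit matching with a relation-based distillation criterion. The sketch below is a minimal PyTorch illustration of this contrast under stated assumptions, not the thesis's exact formulation: the relation term here matches the C x C Gram matrix of class probability maps, whose diagonal reflects intra-class confidence mass and whose off-diagonal reflects inter-class co-activation. The temperature tau, the MSE matching distance, and all function names are illustrative assumptions.

```python
# Hedged sketch: per-pixel KL distillation vs. a class-relationship
# distillation term for dense prediction (e.g., human parsing).
# The relation formulation below is an illustrative assumption,
# not the thesis's exact method.

import torch
import torch.nn.functional as F


def kl_distillation_loss(student_logits, teacher_logits, tau=4.0):
    """Classic per-pixel KL distillation.

    student_logits, teacher_logits: (B, C, H, W) raw logits.
    """
    b, c, h, w = student_logits.shape
    s = F.log_softmax(
        student_logits.permute(0, 2, 3, 1).reshape(-1, c) / tau, dim=1)
    t = F.softmax(
        teacher_logits.permute(0, 2, 3, 1).reshape(-1, c) / tau, dim=1)
    # KL(teacher || student), scaled by tau^2 as in Hinton et al. (2015).
    return F.kl_div(s, t, reduction="batchmean") * tau * tau


def class_relation_loss(student_logits, teacher_logits, tau=1.0):
    """Illustrative intra-/inter-class relationship distillation.

    Instead of matching per-pixel distributions, match the C x C
    correlation structure of the class score maps, so the student
    mimics how the teacher relates classes to one another even when
    the capacity gap makes exact logit matching hard.
    """
    def relation_matrix(logits):
        b, c, h, w = logits.shape
        p = F.softmax(logits / tau, dim=1).reshape(b, c, h * w)
        # (B, C, C): diagonal ~ intra-class confidence mass,
        # off-diagonal ~ inter-class co-activation.
        return torch.bmm(p, p.transpose(1, 2)) / (h * w)

    r_s = relation_matrix(student_logits)
    r_t = relation_matrix(teacher_logits)
    return F.mse_loss(r_s, r_t)
```

In training, such a relation term would typically be weighted and added to the usual supervised objective, e.g. `loss = cross_entropy + alpha * class_relation_loss(student_logits, teacher_logits)`, with `alpha` tuned on a validation split.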


Contents

I. Introduction
II. Related Works
III. Proposed Method
3.1 Framework Overview
3.2 Proposed Method
3.2.1 Effective model light-weighting methods
3.2.2 An Effective Lightweight Spatial Feature Fusion Attention Method for Human Parsing Models (LSFA)
3.2.3 Applying the intra-class and inter-class relationship approach to knowledge distillation
IV. Experimental Results and Discussion
4.1 Dataset
4.2 Implementation Details
4.3 Inference speed and performance
4.4 Ablation experiment
V. Conclusion
References
