
Deep Learning Methods for Sign Language Production

Table of Contents

1 Introduction
1.1 Introduction
1.2 Contributions of This Dissertation
1.3 Outline
2 Background
2.1 Avatar Approaches for Sign Language Production
2.2 Deep Learning Approaches for Sign Language Production
2.2.1 Progressive Transformer
2.3 Summary
3 Datasets and Evaluation Metrics
3.1 Datasets
3.2 Evaluation Protocols
3.3 Evaluation Metrics
3.3.1 Back-Translation Model
3.3.2 Bilingual Evaluation Understudy (BLEU)
3.3.3 Recall-Oriented Understudy for Gisting Evaluation (ROUGE)
3.3.4 Word Error Rate (WER)
4 Cascade Dual-Decoder Transformer for Sign Language Production
4.1 Cascade Dual-Decoder Transformer
4.1.1 Text Encoder
4.1.2 Hand Pose Decoder
4.1.3 Sign Pose Decoder
4.2 Spatio-Temporal Loss
4.2.1 Spatial Regression Loss
4.2.2 Temporal Continuity Loss
4.3 Performance Evaluations
4.3.1 Model Configuration
4.3.2 Quantitative Results
4.3.2.1 Baseline Comparison
4.3.3 Ablation Study
4.3.3.1 Impact of Different Numbers of Decoder Layers
4.3.3.2 Effect of Spatio-Temporal Loss
4.3.4 Qualitative Analysis
4.4 Conclusions
5 Multi-Channel Spatio-Temporal Transformer for Sign Language Production
5.1 Problem Definition
5.2 Multi-Channel Spatio-Temporal Transformer
5.2.1 Encoder
5.2.2 Multi-Channel Spatio-Temporal Decoder
5.2.2.1 Channel-Specific and Full-Channel Embedding
5.2.2.2 Spatial-Attention Module
5.2.2.3 Temporal-Attention Module
5.2.2.4 Spatio-Temporal Fusion Module
5.3 Performance Evaluations
5.3.1 Model Configuration
5.3.2 Quantitative Results
5.3.2.1 Baseline Comparison
5.3.2.2 Ablation Study
5.3.3 Qualitative Analysis
5.4 Conclusions
6 Conclusions and Future Work
6.1 Conclusions
6.2 Possible Future Work
Bibliography
A List of Research Outputs
A.1 SCI/SCIE Journal Papers
A.2 International Conference Papers
