dCollection Digital Academic Information Distribution System

Action Segmentation using Bezier Curvature as Spatio-Temporal Feature by Triplet Learning


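Subject (keywords): Action Segmentation
Subject (DDC): 006.31
Publisher: Ajou University
Advisor: Kyung-Ah Sohn (손경아)
Year of publication: 2022
Degree conferred: February 2022
Degree: Master's
Department and major: Graduate School, Department of Artificial Intelligence
Actual URI: http://www.dcollection.net/handler/ajou/000000031451
Language of text: English
Copyright: Ajou University theses are protected by copyright.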

Abstract

With the development of recording technologies, the demand for video-based techniques is increasing. Despite the success of action recognition, which classifies short trimmed videos, handling long untrimmed videos remains a challenge. Action segmentation is the task of detecting and temporally locating action segments in a video. Although previous approaches have shown outstanding architectural development, the feature extractor has remained the same. Recent approaches require additional temporal information, such as action boundary information, which is difficult to obtain in real-world settings. This is because temporal features are not as well developed as spatial features. In this thesis, we propose a new feature synthesis framework called the Temporal Curvature Feature (TCF). The framework consists of two stages: (a) framewise embedding and (b) curvature synthesis. In the framewise embedding stage, we use a triplet network to map a video into T points, each based on the action label of the corresponding frame. In the curvature synthesis stage, we approximate a curve from these embedding points and synthesize curvatures from the curve. These curvatures are used to enhance the temporal information of the data through a framewise residual operation. The outputs have the same shape as the original input and are used as the new input to bring out the potential of various models. To validate the effectiveness of our approach, we apply the curvatures to three action segmentation datasets, i.e., GTEA, 50Salads, and Breakfast, and use the new input to train previous state-of-the-art models: MS-TCN, MS-TCN2, ASRF, and ASFormer. The quantitative results show overall increases in performance. In particular, the F1 scores show the effectiveness of the approach on the segmentation problem. Finally, the qualitative results demonstrate that the curvature helps the models better understand temporal information.
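The two stages described in the abstract can be illustrated with a small NumPy sketch. The abstract does not say how the curve is fitted or how the residual is applied, so everything specific here is an assumption made for illustration: the function names `quadratic_bezier_curvature` and `add_curvature_residual` are hypothetical, the curve is a quadratic Bezier curve built from each point and its two neighbours (the thesis may fit a different curve), the curvature is evaluated at the curve midpoint, and the residual is a plain broadcast addition. The embedding points are taken as given, standing in for the output of the triplet network.

```python
import numpy as np


def quadratic_bezier_curvature(points, eps=1e-8):
    """Curvature at u = 0.5 of a quadratic Bezier curve around each point.

    points: (T, d) array of framewise embeddings (assumed to come from
    some embedding model; random or hand-made arrays work for testing).
    Returns a (T,) array of non-negative curvatures.
    """
    points = np.asarray(points, dtype=float)
    # Assumption: repeat the first and last point so that every frame
    # has a left and a right neighbour.
    padded = np.concatenate([points[:1], points, points[-1:]], axis=0)
    p0, p1, p2 = padded[:-2], padded[1:-1], padded[2:]

    # Derivatives of B(u) = (1-u)^2 p0 + 2(1-u)u p1 + u^2 p2 at u = 0.5.
    d1 = p2 - p0                     # B'(0.5)
    d2 = 2.0 * (p2 - 2.0 * p1 + p0)  # B''(u), constant in u

    # Curvature of a curve in R^d:
    # sqrt(|B'|^2 |B''|^2 - (B' . B'')^2) / |B'|^3
    n1 = np.sum(d1 * d1, axis=1)
    n2 = np.sum(d2 * d2, axis=1)
    dot = np.sum(d1 * d2, axis=1)
    numerator = np.sqrt(np.maximum(n1 * n2 - dot ** 2, 0.0))
    return numerator / (n1 ** 1.5 + eps)


def add_curvature_residual(features, curvature):
    """Assumed form of the framewise residual: add each frame's scalar
    curvature to every channel of that frame's feature vector.
    The output has the same (T, D) shape as the input features."""
    features = np.asarray(features, dtype=float)
    return features + np.asarray(curvature, dtype=float)[:, None]


# Three 2-D embedding points forming a sharp corner at the middle frame.
embedding = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]])
kappa = quadratic_bezier_curvature(embedding)   # approximately [0, 1, 0]
features = np.zeros((3, 4))                     # placeholder frame features
enhanced = add_curvature_residual(features, kappa)
```

In this toy input the curvature peaks at the frame where the embedding trajectory turns, which is the kind of signal that could mark a change between actions. Because the residual keeps the feature shape, the enhanced array could be passed to an existing segmentation model in place of the original features.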

Table of Contents

1 Introduction
2 Related Works
3 Method
3.1 Framewise Embedding
3.1.1 Triplet Network for Video
3.1.2 Reorganization for Triplet Selection
3.2 Curvature Synthesis
3.2.1 Bezier Curve Principle
3.2.2 Continuous Temporal Information
3.2.3 Discrete Temporal Information
3.3 Action Segmentation from Curvature
4 Experiment
4.1 Datasets
4.2 Metrics
4.3 Backbone Models
4.4 Quantitative Results
4.4.1 Comparison with the state-of-the-art on GTEA dataset
4.4.2 Comparison with the state-of-the-art on 50Salads dataset
4.4.3 Comparison with the state-of-the-art on Breakfast dataset
4.5 Qualitative Results
4.5.1 Curvature Effect on Backbone 1
4.5.2 Curvature Effect on Backbone 2
4.5.3 Curvature Effect on Backbone 3
4.6 Effect of Reorganization
4.6.1 Partition Selection
4.6.2 Successive Selection
4.6.3 Reorganization
5 Conclusion
