dCollection Digital Academic Information Distribution System

On-policy Deep Reinforcement Learning for HPC Job Scheduling: Enhancing Performance Stability through Dynamic Data Selection

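Subject (keywords): job scheduling, deep reinforcement learning, self-attention, high-performance computing
Subject (DDC): 006.31
Publisher: Ajou University Graduate School
Advisor: Sangyoon Oh
Publication year: 2024
Degree conferral date: February 2024
Degree: Master's
Department and major: Graduate School, Department of Artificial Intelligence
URI: http://www.dcollection.net/handler/ajou/000000033345
Language of text: English
Copyright: Ajou University theses are protected by copyright.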

Abstract

Job scheduling in High-Performance Computing (HPC) systems is a crucial task that determines how computational resources are allocated. Traditional heuristic algorithms often fail to capture the full complexity of job scheduling, and reinforcement learning (RL) offers a promising alternative. However, the performance of on-policy RL algorithms can depend strongly on the job data used for training, which leads to variability in performance. To enhance performance stability, we propose a novel dynamic data selection method: we predict the reward value using a tree-based machine learning model and select the data based on this prediction. This selection process refines the input to the RL algorithm and improves performance stability. Furthermore, we introduce a self-attention-based on-policy network for job scheduling in HPC systems, which utilizes the selected data more effectively when formulating policies. We validate the proposed method through experiments on real-world job log data from HPC systems, comparing its performance with heuristic scheduling algorithms. The results confirm that our approach enhances performance stability across real-world workloads and improves the overall performance of on-policy RL algorithms.
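The reward-prediction step described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract does not name the tree model, the input features, or the selection rule. The sketch assumes a depth-1 regression tree (a stump) on a single hypothetical scalar feature per job batch, and it assumes that selection keeps the `k` batches with the highest predicted reward. The names `fit_stump`, `predict`, and `select_batches` are invented for illustration.

```python
from statistics import mean


def fit_stump(features, rewards):
    """Fit a depth-1 regression tree on one scalar feature per sample.

    Assumption: each sample is a job batch summarized by a single
    hypothetical feature (for example, mean requested runtime).
    Returns (threshold, left_value, right_value).
    """
    best = None
    # Try every distinct feature value except the smallest as a split point,
    # so that both sides of the split are always non-empty.
    for threshold in sorted(set(features))[1:]:
        left = [r for f, r in zip(features, rewards) if f < threshold]
        right = [r for f, r in zip(features, rewards) if f >= threshold]
        left_mean, right_mean = mean(left), mean(right)
        sse = sum((r - left_mean) ** 2 for r in left)
        sse += sum((r - right_mean) ** 2 for r in right)
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:
        # All feature values are identical: predict the global mean.
        overall = mean(rewards)
        return (features[0], overall, overall)
    return best[1:]


def predict(stump, feature):
    """Predict the reward for one batch from its feature value."""
    threshold, left_value, right_value = stump
    return left_value if feature < threshold else right_value


def select_batches(stump, candidates, k):
    """Keep the k candidate batches with the highest predicted reward.

    candidates is a list of (batch_id, feature) pairs. Top-k selection
    is an illustrative assumption, not the rule stated in the thesis.
    """
    ranked = sorted(candidates, key=lambda c: predict(stump, c[1]), reverse=True)
    return [batch_id for batch_id, _ in ranked[:k]]
```

In use, the stump would be fitted on batches whose rewards have already been observed, and `select_batches` would then filter new candidate batches before they are passed to the on-policy learner. A practical system would more likely use a deeper tree or an ensemble over several features.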

Table of Contents

Ⅰ INTRODUCTION
Ⅱ RELATED WORKS
  1. HPC Job Scheduling
  2. Reinforcement Learning-based Job Scheduling
  3. Data Selection for Reinforcement Learning
Ⅲ BACKGROUND
  1. Overview of Reinforcement Learning
  2. Off-Policy and On-Policy Reinforcement Learning
  3. Proximal Policy Optimization
  4. Self-Attention Mechanism
Ⅳ DYNAMIC DATA SELECTION WITH DEEP REINFORCEMENT LEARNING AGENT
  1. Dynamic Data Selection
  2. The complexity of the DS
  3. Self-Attention-based Actor-Critic Network
  4. Data Selection and Self-Attention Actor-Critic Network Algorithm
Ⅴ EXPERIMENTS
  A. Experiments Setup
    1. HPC job data
    2. Compared Algorithms
    3. DS-DRL Evaluation
    4. Evaluation Metrics
  B. Experimental results and analytics
    1. Evaluation of Dynamic Data Selection in Reward Prediction
    2. Impact of Data Selection Method on System Performance
    3. Comparative Analysis of Scheduling Algorithms on Average Bounded Slowdown
    4. Comparative Analysis of Scheduling Algorithms on Waiting Time
    5. Comparative Evaluation with other real-world datasets
Ⅵ CONCLUSION
REFERENCE
