Advanced Resource and Mobility Management for Multi-UAV Networks

Abstract

In this dissertation, a comprehensive framework is presented as a solution for the integrated optimization of UAV mobility and user allocation, specifically tailored for multi-UAV systems operating to ensure indoor emergency connectivity. The research is presented through three progressive stages that lead to an advanced, interference-aware coordination framework.

First, to establish a robust and realistic foundation for the research, this dissertation begins by performing an empirical validation of a floor-aware outdoor-to-indoor (O2I) channel model that incorporates floor penetration loss (FPL), using real-world measurements in a factory-type multi-story environment. This validation provides a credible and realistic basis for all subsequent analysis and algorithm development.

Second, building upon this validated model, this dissertation frames the joint optimization challenge within the mathematical structure of a partially observable Markov decision process (POMDP). To address this challenge, a novel multi-agent deep reinforcement learning (MADRL) algorithm is introduced, which integrates prioritized experience replay (PER) into the multi-agent double deep Q-network (MADDQN) framework. This stage focuses on deriving an effective solution for the core challenge within a half-duplex (HD) communication environment.

Lastly, the framework is extended to a more complex full-duplex (FD) communication scenario, where UAVs must simultaneously serve indoor users and first responders. To address the resulting intricate interference landscape, an advanced MADRL algorithm, PER with attention-based multi-agent dueling double DQN (PER-A-MAD3QN), is proposed. This algorithm explicitly incorporates a self-attention mechanism, enabling each UAV to intelligently manage interference and enhance cooperative performance.

Comprehensive simulations confirm the efficacy of the three-stage framework. The performance evaluation shows that, from the empirically validated O2I channel model through the advanced MADRL algorithms, the framework significantly reduces mission completion time and improves service reliability in indoor emergency scenarios.
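The two core ingredients of the second stage, prioritized experience replay and the double-DQN target, can be illustrated in a few lines. The sketch below is not the dissertation's implementation; it is a minimal NumPy illustration assuming standard proportional PER (sampling probability $p_i^\alpha / \sum_j p_j^\alpha$ with importance-sampling correction) and the decoupled double-DQN target, with toy placeholder values throughout:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy replay buffer: 5 transitions with TD-error-derived priorities.
priorities = np.array([0.1, 0.5, 2.0, 0.2, 1.2])
alpha, beta = 0.6, 0.4  # prioritization and importance-sampling exponents

# Proportional PER: P(i) = p_i^alpha / sum_j p_j^alpha
probs = priorities**alpha / np.sum(priorities**alpha)
idx = rng.choice(len(priorities), size=2, p=probs)

# Importance-sampling weights correct the bias introduced by
# non-uniform sampling; normalized by the max for stability.
weights = (len(priorities) * probs[idx]) ** (-beta)
weights /= weights.max()

# Double-DQN target: the online network SELECTS the next action,
# the target network EVALUATES it (decoupled max reduces overestimation).
q_online_next = np.array([1.0, 3.0, 2.0])  # toy Q_online(s', .)
q_target_next = np.array([0.9, 2.5, 2.2])  # toy Q_target(s', .)
a_star = int(np.argmax(q_online_next))
reward, gamma = 1.0, 0.99
td_target = reward + gamma * q_target_next[a_star]
```

In a full MADDQN training loop, each agent would apply this target per sampled transition, scale its loss by `weights`, and write the new TD error back into `priorities`.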
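The third stage's self-attention mechanism lets each UAV weight the parts of its observation (e.g., per-neighbor interference features) by relevance before value estimation. As a hedged sketch, assuming standard scaled dot-product attention over rows of per-neighbor features (the shapes and weight matrices here are illustrative, not the dissertation's architecture):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(obs, Wq, Wk, Wv):
    """Scaled dot-product self-attention over per-neighbor feature rows."""
    Q, K, V = obs @ Wq, obs @ Wk, obs @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    attn = softmax(scores, axis=-1)  # each row: weights over neighbors
    return attn @ V, attn

rng = np.random.default_rng(1)
n_neighbors, d, d_k = 4, 6, 8      # illustrative dimensions
obs = rng.normal(size=(n_neighbors, d))
Wq, Wk, Wv = (rng.normal(size=(d, d_k)) for _ in range(3))
ctx, attn = self_attention(obs, Wq, Wk, Wv)  # ctx: attention-weighted context
```

The resulting context rows would feed the dueling-DQN value and advantage streams, so each agent's Q-estimate emphasizes the neighbors that dominate its interference landscape.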


Table of contents

1 Introduction
1.1 Background and motivation
1.2 Contributions
1.3 Overview of dissertation
2 Related work
2.1 Channel models for UAV-BS networks
2.2 MADRL for UAV-BS control
2.3 FD communication in UAV-BS networks
3 System model and empirical validation of the floor-aware O2I channel model
3.1 Network architecture and scenario
3.2 UAV-BS mobility model
3.3 Channel model and empirical validation
3.3.1 O2I channel model with FPL
3.3.2 Empirical validation
4 MADRL for joint optimization in HD indoor networks
4.1 System model and problem formulation
4.1.1 Uplink signal and interference with UAV-UE association model
4.1.2 UAV-BS energy consumption model
4.1.3 Problem formulation
4.2 Proposed MADRL learning framework
4.2.1 Decentralized POMDP formulation
4.2.1.1 State and observation space
4.2.1.2 Action space: a state-dependent, dual-mode policy
4.2.1.3 Reward structure
4.2.2 Proposed PER-MADDQN algorithm
4.2.2.1 Proposed double deep Q-networks
4.2.2.2 PER for efficient learning
4.2.3 Training workflow of proposed algorithm
4.3 Performance evaluation
4.3.1 Simulation setup
4.3.2 Implementation details and hyperparameters
4.3.3 Baseline algorithms
4.3.4 Simulation results and analysis
4.3.4.1 Reward convergence analysis
4.3.4.2 User connectivity performance
4.3.4.3 Analysis of user service quality
4.3.4.4 Qualitative analysis of learned policies
4.3.4.5 Scalability analysis in extended environment
4.4 Concluding remarks
5 Attention-based MADRL for joint optimization in FD indoor networks
5.1 System model and problem formulation
5.1.1 FD interference and UAV-UE association model
5.1.1.1 DL interference analysis
5.1.1.2 UL interference analysis
5.1.2 Interference-aware signal modeling
5.1.3 Problem formulation
5.2 Proposed MADRL framework
5.2.1 Decentralized POMDP formulation
5.2.1.1 State and observation space
5.2.1.2 Action space: a state-dependent, dual-mode policy
5.2.1.3 Multi-component reward structure
5.2.2 Proposed PER-A-MAD3QN algorithm
5.2.2.1 D3QN backbone for enhanced value estimation
5.2.2.2 Self-attention for interference-aware observation processing
5.2.3 Training workflow of proposed algorithm
5.2.3.1 Phase 1: Decentralized data collection
5.2.3.2 Phase 2: Centralized model update
5.3 Performance evaluation
5.3.1 Simulation results and analysis
5.3.1.1 Learning performance and convergence
5.3.1.2 Mission completion speed and scalability
5.3.1.3 FD service quality and energy efficiency
5.3.1.4 Analysis of learned agent behaviors
5.3.1.5 Comparative analysis between HD and FD
5.4 Concluding remarks
6 Conclusion
References
