
Random Access Resource Selection in Mobile Networks with Machine Learning

Abstract

In future-generation network technologies, such as 3GPP's Narrowband Internet of Things (NB-IoT), smart meters, and 5G V2X, there is a need to address the ultra-reliable and low-latency communications (URLLC) use case. Estimates show that by the end of 2025, 50 billion devices will be connected across the globe. These applications rely on the aforementioned technologies, and the massive rollout of communicating devices causes congestion and latency. Extensive studies and research have to be conducted to reduce the latency and improve the reliability of the supporting technologies. For transmission, we focus on 5G's millimeter-wave (mmWave) bands, which alleviate the problem of bandwidth scarcity. However, high-frequency bands do not cover large distances. The coverage problem is addressed by using a heterogeneous network that comprises numerous small cells and macrocells, defined by transmission and reception points (TRxPs). In such a network, random access (RA) is the pivotal task for any mobile node to connect to a TRxP, and it experiences delay and congestion. State-of-the-art RA resource selection does not fulfill the envisioned latency requirements of 5G and beyond-5G systems. RA is considered a challenging function in which users attempt to select an efficient TRxP within a given time. Ideally, an efficient TRxP is less congested, minimizing delays in users' random access. However, owing to the nature of random access, it is not feasible to deploy a centralized controller that estimates the congestion level of each cell and delivers this information back to users during random access. To solve this problem, we establish an optimization problem and employ a machine-learning-based scheme. Additionally, an important aspect of such estimation is that it can be approached through machine-learning architecture and hyperparameter search.
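The selection idea described above can be sketched as a small reinforcement-learning loop. This is a minimal illustration, not the dissertation's DQN: a bandit-style agent repeatedly picks one of several TRxPs, receives the negative of a simulated access delay as reward, and learns to prefer the least congested TRxP. The congestion probabilities and the back-off delay model below are illustrative assumptions, not values from the thesis.

```python
import random

# Assumed setup: 4 TRxPs with hypothetical per-TRxP collision probabilities.
N_TRXP = 4
CONGESTION = [0.8, 0.3, 0.5, 0.1]
EPSILON, EPISODES = 0.1, 5000

def access_delay(trxp, rng):
    """Simulated RA delay: one slot plus one back-off slot per collision."""
    delay = 1.0
    while rng.random() < CONGESTION[trxp]:
        delay += 1.0
    return delay

rng = random.Random(0)
q = [0.0] * N_TRXP                  # value estimate per TRxP (stateless view)
n = [0] * N_TRXP                    # pull counts for sample-average updates

for _ in range(EPISODES):
    if rng.random() < EPSILON:      # explore a random TRxP
        a = rng.randrange(N_TRXP)
    else:                           # exploit the current best estimate
        a = max(range(N_TRXP), key=lambda i: q[i])
    r = -access_delay(a, rng)       # reward = negative access delay
    n[a] += 1
    q[a] += (r - q[a]) / n[a]       # incremental sample-average update

best = max(range(N_TRXP), key=lambda i: q[i])
print("learned values per TRxP:", [round(v, 2) for v in q])
print("preferred TRxP:", best)      # converges to the least congested TRxP
```

The key point the sketch makes is that the agent needs no centralized congestion report: congestion is inferred purely from the delays its own access attempts experience.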
Applying such a method in a mobile environment is an open research question for RA in terms of improving key performance indicators (KPIs) and rewards. Most studies in the literature apply neural architecture search to supervised learning tasks, and there is limited work on deep reinforcement learning (DRL). Therefore, we also propose an optimization framework to address the neural-network architecture search problem. The goal is to optimize the DRL algorithm so that it yields an additional performance improvement. This framework includes a performance extrapolation algorithm. Through extensive simulations, we demonstrate that the proposed machine-learning-based approach and the discovered architectures improve random access performance. Specifically, experiments demonstrate a reduction in RA delay and a significant improvement in access success probability. Furthermore, the proposed LSTM-based extrapolation algorithm predicts learning curves better than notable regression techniques, such as linear interpolation.
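Why learning-curve extrapolation matters for the hyperparameter search can be shown with a toy sketch. The dissertation's predictor is an LSTM; here, purely for illustration, we contrast plain linear extrapolation with a saturating-exponential fit on a synthetic reward curve. The curve shape, the observation horizon, and the model y = a·(1 − exp(−t/τ)) are all assumptions made for this example.

```python
import math

def true_curve(t):
    # Synthetic learning curve: reward rises and plateaus at 1.0.
    return 1.0 - math.exp(-t / 20.0)

observed_t = list(range(1, 31))             # only the first 30 training epochs
observed_y = [true_curve(t) for t in observed_t]
target_t = 100                              # predict the reward at epoch 100

# Baseline: least-squares line through the last 10 observed points.
xs, ys = observed_t[-10:], observed_y[-10:]
k = len(xs)
mx, my = sum(xs) / k, sum(ys) / k
slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
linear_pred = my + slope * (target_t - mx)

# Saturating fit y = a*(1 - exp(-t/tau)): grid-search tau, solve a in closed form.
best_fit = None
for tau in [t / 2 for t in range(2, 200)]:
    f = [1.0 - math.exp(-t / tau) for t in observed_t]
    a = sum(fi * yi for fi, yi in zip(f, observed_y)) / sum(fi * fi for fi in f)
    sse = sum((a * fi - yi) ** 2 for fi, yi in zip(f, observed_y))
    if best_fit is None or sse < best_fit[0]:
        best_fit = (sse, a, tau)
_, a, tau = best_fit
saturating_pred = a * (1.0 - math.exp(-target_t / tau))

truth = true_curve(target_t)
print(f"truth at t={target_t}: {truth:.3f}")
print(f"linear extrapolation: {linear_pred:.3f}")
print(f"saturating-model fit: {saturating_pred:.3f}")
```

The linear baseline keeps extending the early upward trend and overshoots the plateau, while a model that can saturate lands near the true final reward; a sequence model such as an LSTM plays the latter role in the proposed framework, without a hand-picked curve family.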


Table of Contents

1. Introduction 1
1.1. Organization 1
1.2. Mobile Networks 2
1.3. Markov Decision Process and Machine Learning Algorithms 3
1.4. Literature Review 4
1.4.1. AI for Wireless Networking 4
1.4.2. Information Redundancy for RA 5
1.4.3. Architectural Improvements for RA 7
1.4.4. Neural Architecture Search (NAS) 8
2. Reinforcement Learning for 5G Random Access 10
2.1. The 5G System 10
2.2. Initial RA Procedure and RA Selection Problem 15
2.3. AI, Machine Learning, and Recent Data Science Activities 17
2.4. Reinforcement Learning 19
3. RA Resource Selection Using DRL in 5G 21
3.1. Introduction 21
3.2. Problem Formulation and System Model 26
3.2.1. Traffic Model 27
3.2.2. Combined Channel Model 28
3.2.3. Problem Formulation 28
3.3. RL Based Selection of TRxPs for RA 32
3.3.1. TRxPs Search and Selection 32
3.3.2. System Parameters through SIB2 33
3.3.3. Proposed RL Based Selection for RA 33
3.3.4. Design 36
3.3.5. Algorithms 40
3.4. Evaluation 44
3.4.1. Experimental Setup 45
3.4.2. Performance Metric Measures 47
3.4.3. Learning Performance 48
3.4.4. Impact of Proposed Algorithm on RA KPIs 49
4. RA Resource Selection via DQN Architecture Search 54
4.1. Introduction 54
4.2. Problem Statements 54
4.2.1. Sub-Problem Definition 1: NAS and HP Search 55
4.2.2. Sub-Problem Definition 2: DQN's Learning Curve Prediction Problem 56
4.3. Proposed Solutions 57
4.3.1. DRL Optimization Framework 58
4.3.2. DRL Architecture for RAPS 59
4.3.3. Neural Architecture Search Spaces for DRL 60
4.3.4. Performance Prediction Model for DRL HPOs 61
4.4. Experiment and Result Analysis 71
4.4.1. Machine Learning Performance 72
4.4.2. Random Access KPI Performance 73
5. Remarks 80
5.1. Algorithm Overhead Analysis 80
5.2. Conclusion 82
