Towards Efficient Collaborative Deep Learning Inference for Image-based Sensing Systems

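Korean title: 영상 기반 센서 시스템을 위한 효율적 협력 딥러닝 추론 연구 (A Study on Efficient Collaborative Deep Learning Inference for Image-based Sensing Systems)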

Abstract

Image-based sensing systems collect images in real-life environments and analyze them to extract meaningful context for application-level decision making. These systems consist of embedded devices, which serve as image-capturing sensing units, and an inference server that analyzes the collected images. This dissertation introduces methods that address the system requirements of image-based sensing systems from the perspective of each of these hardware components. Embedded devices, which are limited in computing and energy resources, typically collect images and send them to the remote inference server for processing. Here, it is important to account for the battery lifetime of these constrained platforms by controlling on-board computation and communication overhead. The inference server, which has ample computational power, mainly applies machine learning (or other inference) models to the images collected from the individual embedded devices. Inference accuracy greatly affects the application performance of image-based sensing systems. However, deep learning models that achieve high inference accuracy require a significant amount of computation. Thus, despite the computing resources of a modern server, the computational latency is not negligible. High latency in running the deep learning model can violate service level objectives, which degrades the application service. In this dissertation, we propose methods for supporting efficient collaborative deep learning inference for image-based sensing systems to address these challenges. First, we illustrate the system lifetime challenge common to image-based sensing use cases with a wireless image sensor network (WISN) consisting of resource-constrained embedded devices.
This dissertation shows that simple computations at the sensing platforms, such as data compression techniques, can significantly reduce the communication overhead while still maintaining acceptable classification performance. Specifically, by applying the proposed image compression scheme to a real-world scenario, animal habitat monitoring, we observe that the amount of transmitted data is greatly reduced with minimal degradation of inference accuracy at the inference server. Second, we introduce a challenging application scenario that requires highly accurate inference with minimal processing latency. This dissertation proposes a system architecture consisting of a low-latency object detection deep learning model with pre- and post-processing mechanisms to achieve the application goals. By applying this inference process to a video-based safety monitoring system, we show that the proposed system meets a challenging set of practical (industrial) application-level requirements. Finally, building on the observations from the work above, we propose an adaptive computation offloading decision scheme for collaborative inference between a mobile GPU-based embedded platform and modern deep learning servers. Specifically, we design a latency-aware collaborative execution technique that accurately identifies the load status of the deep learning server and the communication status. The work presented in this dissertation can serve as a guideline for addressing some of the core challenges in designing practically applicable image-based sensing systems, from extending individual device lifetime to simultaneously improving inference accuracy and latency.

Table of Contents

Chapter 1. Introduction
1.1 Background & Motivation
1.2 Contribution of this Dissertation
1.3 Overview of the Dissertation
Chapter 2. Data compression scheme for low-power embedded devices in image-based sensing systems
2.1 Scenario and System Design
2.2 Decreasing Resolution
2.2.1 Color Quantization
2.2.2 Image resize
2.3 Image Classification
2.3.1 Convolutional neural networks for image classification
2.3.2 Filtering corrupted images
2.3.3 Building and evaluating the CNN model
2.4 Related Work
2.5 Discussion
2.6 Summary
Chapter 3. Designing low latency inference server for high-accuracy required sensing system
3.1 Introduction
3.2 Related Work
3.2.1 Object Detection Algorithms
3.2.2 Safety Monitoring Applications
3.2.3 Multi-stage Anomaly Detection Systems
3.3 Background
3.3.1 Factory Assembly Lines and Threats to Workers
3.3.2 System Requirements
3.4 Overview of SafeFac Components
3.5 Human Presence Detection Module in SafeFac
3.5.1 Image pre-processing subcomponent
3.5.2 Human Object Detecting Deep Learning Model
3.5.3 Post-processing Module
3.5.4 Adaptive Camera Scheduling
3.6 Evaluation
3.6.1 Evaluation Dataset
3.6.2 Detection Latency
3.6.3 Detection Accuracy
3.7 Discussions and Future Research
3.8 Summary
Chapter 4. Deep learning computation offloading decision scheme for image-based sensing systems
4.1 Introduction
4.2 Background and Related Work
4.2.1 Deep Learning Model Computation Offloading
4.2.2 Deep Learning Inference Server Designs
4.3 Latency Impacting Factors
4.3.1 Experimental Environment
4.3.2 Impact of Server Load
4.3.3 Impact of Network Conditions
4.3.4 Impact of Batch Inference
4.4 DIAMOND Overview
4.5 DIAMOND Server Design
4.5.1 Layer-centric Batch Processing
4.5.2 Server Load Monitor
4.6 Profiling and Latency Prediction
4.6.1 Offline Profiling
4.6.2 Online Profiling
4.7 Run-time Partitioning Decisions
4.8 Evaluation
4.8.1 Experimental Setup
4.8.2 Performance Comparison with Naive Approach
4.8.3 Impact of Server Load Dynamics
4.8.4 Impact of Communication Quality Dynamics
4.8.5 Impact of Probability Threshold
4.8.6 Serving Various DNN Architectures
4.9 Summary
Chapter 5. Conclusion
5.1 Future work
Bibliography
