
Accurate Blur Decomposition From a Single Image Using Conditional GANs

Abstract

Single image blur decomposition, also known as image deblurring, is a fundamental task in computer vision with a wide range of practical applications. In this study, we investigate the limitations of existing blur decomposition methods in three specific settings: single-to-single face, single-to-video face, and single-to-video general blur decomposition, and we propose novel solutions to overcome these limitations.

For single-to-single face blur decomposition, previous face deblurring methods have used semantic segmentation maps as prior knowledge. Most of these methods generate a segmentation map from the blurred facial image and then restore the image using that map in a sequential manner. However, the restoration performance then depends on the segmentation accuracy, and it is generally difficult to obtain an accurate segmentation map from a blurred image. Instead of such sequential methods, we propose an efficient method that learns the flow of facial component restoration without performing segmentation. To this end, we propose a multi-semantic progressive learning (MSPL) framework that progressively restores the entire face image, starting from facial components such as the skin, followed by the hair, and finally the inner parts (eyes, nose, and mouth). Furthermore, we propose a discriminator that observes the reconstruction flow of the generator. In addition, we present new test datasets to facilitate the comparison of face deblurring methods. Various experiments demonstrate that the proposed MSPL framework outperforms existing facial image deblurring methods, both qualitatively and quantitatively.

For single-to-video face blur decomposition, we introduce a novel framework for continuous facial motion deblurring that restores the continuous sharp moments latent in a single motion-blurred face image via a moment control factor. Although a motion-blurred image is the accumulated signal of continuous sharp moments during the exposure time, most existing single-image deblurring approaches restore only a fixed number of frames and require multiple networks and training stages. To address this problem, we propose a continuous facial motion deblurring network based on GANs (CFMD-GAN), a novel framework that restores the continuous moments latent in a single motion-blurred face image with a single network and a single training stage. To stabilize network training, we train the generator to restore continuous moments in the order determined by our facial motion-based reordering (FMR) process, which exploits domain-specific knowledge of the face. Moreover, we propose an auxiliary regressor that helps the generator produce more accurate images by estimating the continuous sharp moments. Furthermore, we introduce a control-adaptive (ContAda) block that performs spatially deformable convolution and channel-wise attention as a function of the control factor. Extensive experiments on the 300VW dataset demonstrate that the proposed framework generates a varying number of continuous output frames as the moment control factor is varied. Compared with recent single-to-single image deblurring networks trained on the same 300VW training set, the proposed method shows superior performance in restoring the central sharp frame in terms of perceptual metrics, including LPIPS, FID, and ArcFace identity distance. The proposed method also outperforms the existing single-to-video deblurring method in both qualitative and quantitative comparisons; on the 300VW test set, it reached 33.14 dB PSNR and 0.93 SSIM for the recovery of 7 sharp frames.
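
To make the ContAda idea above concrete, the following is a minimal PyTorch sketch, not the thesis implementation: it conditions both the sampling offsets of a deformable convolution and an SE-style channel-attention gate on the scalar moment control factor. The class name, layer sizes, and reduction ratio are illustrative assumptions.

```python
# Hedged sketch of a control-adaptive block: deformable convolution and
# channel attention, both conditioned on a moment control factor t in [0, 1].
# All layer names and sizes are assumptions for illustration only.
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d


class ControlAdaptiveBlock(nn.Module):
    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        # Offsets for the deformable conv: 2 values (x, y) per kernel sample.
        # Predicted from the features concatenated with a broadcast map of t,
        # so the sampling grid changes with the control factor.
        self.offset_head = nn.Conv2d(
            channels + 1, 2 * kernel_size * kernel_size, 3, padding=1
        )
        self.deform_conv = DeformConv2d(
            channels, channels, kernel_size, padding=pad
        )
        # SE-style channel attention, also conditioned on t.
        # The reduction ratio of 4 is an arbitrary illustrative choice.
        self.attn = nn.Sequential(
            nn.Linear(channels + 1, channels // 4),
            nn.ReLU(inplace=True),
            nn.Linear(channels // 4, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) features; t: (B,) control factor in [0, 1].
        b, c, h, w = x.shape
        t_map = t.view(b, 1, 1, 1).expand(b, 1, h, w)
        offsets = self.offset_head(torch.cat([x, t_map], dim=1))
        y = self.deform_conv(x, offsets)
        # Channel-wise gate from pooled features plus the control factor.
        pooled = y.mean(dim=(2, 3))                                # (B, C)
        gate = self.attn(torch.cat([pooled, t.view(b, 1)], dim=1))  # (B, C)
        return y * gate.view(b, c, 1, 1)
```

For example, ControlAdaptiveBlock(64)(torch.randn(2, 64, 32, 32), torch.rand(2)) yields features modulated for the requested moment; sweeping t over [0, 1] would sweep the restored frame across the exposure time.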
For single-to-video general blur decomposition, recent studies have proposed methods for extracting the latent sharp frames from a single blurred image, but they still fall short of restoring satisfactory images. In addition, most existing methods can only decompose a blurred image into sharp frames at a fixed frame rate. To address these problems, we present an Arbitrary Time Blur Decomposition TripleGAN (ABDGAN) that restores sharp frames at flexible frame rates. Our framework plays a min-max game among a generator, a discriminator, and a time-code predictor. The generator serves as a time-conditional deblurring network, while the discriminator and the time-code predictor provide feedback to the generator on producing realistic, sharp images for a given time code. To provide adequate feedback to the generator, we propose a critic-guided (CG) loss realized through the collaboration of the discriminator and the time-code predictor. We also introduce a pairwise order-consistency (POC) loss that imposes a stronger symmetric constraint to improve restoration accuracy. Our experiments show that the proposed method outperforms previously reported methods in both qualitative and quantitative evaluations.
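
The abstract does not spell out the CG loss, so the following PyTorch sketch is only one plausible reading: the time-code predictor's error on a restored frame is weighted by the discriminator's realism score, so the generator receives its strongest time-conditioning feedback on samples the critic already judges realistic. G, D, and P are hypothetical module interfaces, not names from the thesis.

```python
# Hedged sketch of a critic-guided loss for a generator update; this is an
# assumed formulation for illustration, not the thesis objective.
import torch
import torch.nn.functional as F


def critic_guided_loss(
    G: torch.nn.Module,     # time-conditional deblurring generator
    D: torch.nn.Module,     # discriminator: image -> realism logit
    P: torch.nn.Module,     # time-code predictor: image -> time estimate
    blurred: torch.Tensor,  # (B, 3, H, W) blurred inputs
    t: torch.Tensor,        # (B,) time codes in [0, 1]
) -> torch.Tensor:
    restored = G(blurred, t)
    # Realism weight from the critic; detached so the weighting itself does
    # not backpropagate into the discriminator during the generator step.
    realism = torch.sigmoid(D(restored)).squeeze(-1).detach()   # (B,)
    t_pred = P(restored).squeeze(-1)                            # (B,)
    # Per-sample time-prediction error, weighted by the critic's score.
    per_sample = F.mse_loss(t_pred, t, reduction="none")
    return (realism * per_sample).mean()
```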


Table of Contents

1 Introduction
1.1 Background
1.2 The Purpose of the Thesis
1.3 Thesis Structure
2 Related Works
2.1 Single Image Blur Decomposition
2.1.1 Single-to-Single Generic Blur Decomposition
2.1.2 Single-to-Single Face Blur Decomposition
2.1.3 Single-to-Video Generic Blur Decomposition
2.2 Progressive Learning
2.3 Conditional Generative Adversarial Networks
2.3.1 cGANs Based on Supervised Learning
2.3.2 cGANs Based on Semi-supervised Learning
3 Progressive Semantic Face Deblurring
3.1 Semantic Progressive Generator
3.2 Multi-Semantic Discriminator
3.3 Objective Function
3.4 Experimental Results
3.4.1 Datasets
3.4.2 Training Details
3.4.3 Evaluation Metrics
3.4.4 Ablation Study
3.4.5 Comparisons with Existing Methods
3.5 Summary
4 Continuous Facial Motion Deblurring
4.1 Facial Motion-based Reordering
4.2 Continuous Facial Motion Deblurring GAN
4.2.1 Control-Adaptive Block
4.2.2 Discriminator
4.3 Model Objectives
4.4 Experiments
4.4.1 Experimental Setup
4.4.2 Comparisons with the State of the Art
4.4.3 Analysis on CFMD-GAN
4.5 Summary
5 ABDGAN: Arbitrary Time Blur Decomposition Using Critic-Guided TripleGAN
5.1 Overview of ABDGAN
5.2 Time-Conditional Deblurring Network
5.3 Pairwise Order-Consistency Loss
5.4 Critic-Guided Loss
5.5 Training Objectives of ABDGAN
5.6 Experiments
5.6.1 Implementation Details
5.6.2 Datasets
5.6.3 Quantitative Comparisons
5.6.4 Qualitative Comparisons
5.6.5 Ablation Study
5.7 Summary
6 Conclusions
