Generalized Resampling Model for Practical Image Super-Resolution

Sanghyun Son

Seoul National University Library

Generalized Resampling Model for Practical Image Super-Resolution : 영상 초해상도 알고리즘의 현실 문제 해결을 위한 일반화된 리샘플링 모델

DC Field	Value	Language
dc.contributor.advisor	이경무 (Kyoung Mu Lee)	-
dc.contributor.author	손상현 (Sanghyun Son)	-
dc.date.accessioned	2023-11-20T04:21:44Z	-
dc.date.available	2023-11-20T04:21:44Z	-
dc.date.issued	2023	-
dc.identifier.other	000000178282	-
dc.identifier.uri	https://hdl.handle.net/10371/196432	-
dc.identifier.uri	https://dcollection.snu.ac.kr/common/orgView/000000178282	ko_KR
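dc.description	Thesis (Ph.D.) -- Seoul National University Graduate School : College of Engineering, Department of Electrical and Computer Engineering, 2023. 8. Kyoung Mu Lee.	-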
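dc.description.abstract	Recent advances in display, camera, and communication technologies have greatly increased both the demand for and the supply of high-resolution images. However, enjoying high-quality visual content remains difficult in many practical situations, such as when network bandwidth is limited or when no high-quality original exists. To address this problem, image super-resolution aims to restore a high-resolution image from a given low-resolution input, and recent super-resolution methods based on deep convolutional neural networks (CNNs) can successfully restore high-quality details and textures from low-quality images. Nevertheless, these algorithms do not always guarantee high-quality results, because most super-resolution methods presuppose a specific application scenario and are therefore difficult to apply to generalized real-world problems. For example, most high-performance single image super-resolution models are designed only for the task of restoring the original high-resolution image by enlarging a synthetically generated low-resolution image by a fixed integer scale factor. Such methods are not suited to more practical and general problems, such as taking diverse real-world low-resolution images rather than synthetic ones and enlarging them by an arbitrary scale factor. In this dissertation, we extend the conventional notion of image super-resolution to generalized resampling and propose methods that can be used practically in more diverse and general situations. First, Chapter 2 addresses the problem that existing super-resolution methods generalize poorly to real-world low-resolution images because they are trained on artificially synthesized data. Some prior studies that attempt to overcome this have drawbacks, such as the difficulty of acquiring real-world low-resolution and high-resolution training pairs or a reliance on classical bicubic interpolation, so they are not easy to apply universally. Therefore, this dissertation proposes the ADL algorithm, which can learn the objective function of a downsampling neural network in a data-driven manner through unsupervised learning. The proposed method can realistically imitate various kinds of low-resolution images without curated training pairs or bicubic interpolation, and it can be applied effectively to diverse real-world situations that are not predefined. Finally, using the images generated in this way, we optimize existing super-resolution models with training data that is closer to reality and obtain excellent super-resolution performance on arbitrary input images in more generalized situations. Chapter 3 proposes a super-resolution algorithm that can produce outputs of various shapes. A practical super-resolution model should be able to enlarge an image by an arbitrary scale factor according to the user's needs, and it is desirable to extend this concept further so that the model can also perform various warping operations. However, existing methods can perform only integer-factor enlargement, such as ×2 or ×4, and are therefore difficult to use for real-world problems. Therefore, this dissertation proposes SRWarp, which extends super-resolution algorithms to general image resampling. To this end, we use an adaptive warping layer and a multiscale blending technique to implement the spatially varying operations used in geometric image transformations, and we also construct the DIV2KW dataset for training deep resampling models. The resulting SRWarp can perform image warping that is more general than existing super-resolution models allow, including lens distortion correction. Finally, Chapter 4 combines ADL and SRWarp to implement an even more generalized image resampling model. SRWarp, as discussed in Chapter 3, is still trained on artificially synthesized data and therefore has limited generalization performance. We thus introduce the concept of ADL to extend the scope of the image resampling algorithm to real-world applications. Specifically, this dissertation proposes SelfWarp, a self-supervised learning framework that can optimize a super-resolution model capable of handling arbitrary shapes on diverse synthetic and real-world data without separate training pairs. 
Through self-supervised and idempotent loss functions, SelfWarp can restore fine image details without using any paired training data. In addition, we validated the techniques applied in SelfWarp through extensive experiments and confirmed that the model can perform generalized resampling operations that take various kinds of low-resolution images and produce outputs of arbitrary shape. This dissertation redefines image super-resolution as a generalized resampling problem and proposes various methodologies to solve it. Furthermore, through extensive experiments and quantitative and qualitative analysis, we verified that the proposed algorithms can effectively solve real-world image resampling problems. We expect that the proposed methods will allow image super-resolution, one of the classical problems in computer vision, to be applied effectively to various real-world application areas such as image quality enhancement, security surveillance, and observation.	-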
dc.description.abstract	With the rapid development of display, camera, and communication technologies, the supply of and demand for high-resolution images and videos keep increasing. However, accessing high-quality content can be difficult or even impossible in practical situations, such as limited network bandwidth, low-light conditions, or playback of outdated videos. To overcome these limitations, single image super-resolution (SISR or SR) aims to reconstruct a high-resolution image from a given low-resolution input. Recent advances in deep CNNs have enabled SR methods to recover high-quality details and textures from low-quality images surprisingly well. Nevertheless, existing SR algorithms often fail to guarantee high-quality outputs in real-world scenarios. The primary limitation is that these methods are built on less practical assumptions, which makes them unsuitable for more general situations. Specifically, most state-of-the-art SISR models are formulated for synthetic low-resolution images and fixed integer scaling factors. Therefore, they do not perform well in more realistic scenarios, such as taking in-the-wild low-resolution images as inputs or handling arbitrary upsampling factors. In this dissertation, we propose a practical solution for applying SR to real-world applications, particularly from the perspective of generalized image resampling. First, in Chapter 2, we address the issue that existing SR models are mainly designed to take bicubic-downsampled images rather than arbitrary low-resolution inputs from the real world. This limitation stems from the difficulty of preparing and collecting realistic training samples for SR. While a few learning-based methods aim to generate synthetic training pairs, they are still constrained to the less practical bicubic downsampling formulation. 
To this end, we propose a novel data-driven framework to construct an Adaptive Data Loss (ADL) for effective unsupervised learning. Rather than relying on bicubic downsampling formulations, our method can simulate the latent downsampling models of synthetic and real-world images even without paired training examples. Consequently, we implement a state-of-the-art SR model by utilizing low-resolution images generated by our novel downsampling network. Next, in Chapter 3, we extend the concept of conventional SR to various output shapes. As a representative resampling algorithm, an ideal SR model is required to perform arbitrary-scale resizing and even image warping. Nevertheless, existing methods mainly focus on fixed integer scaling factors, e.g., ×2 or ×4, which limits their applicability to diverse real-world scenarios. To extend the scope of SR toward general resampling, we propose SRWarp, a learning-based approach for image warping. SRWarp incorporates an adaptive warping layer and multiscale blending to deal with the spatially varying nature of image transformation. We also introduce the DIV2KW dataset for training the image resampling model. Compared to traditional SR methods, SRWarp enables more generalized image resampling for practical applications, including lens distortion correction. Finally, in Chapter 4, we integrate ADL and SRWarp to develop a generalized image warping algorithm. Since SRWarp is still limited to synthetic training data, we leverage the concept of ADL to further extend its scope toward real-world applications. Specifically, we construct a fully self-supervised framework, SelfWarp, to fine-tune the arbitrary-shape SR model on diverse synthetic and real-world data. Based on novel self-supervised and idempotent loss terms, our model can effectively preserve image contents and reconstruct fine details without any paired training data. 
Extensive analysis validates the design of SelfWarp, which can perform diverse warping operations on arbitrary types of low-resolution (LR) images. As one of the classical problems in computer vision, SR has a variety of applications, such as image quality enhancement, surveillance, and observation. While conventional methods are limited to less practical scenarios, we propose more generalized formulations and methodologies that extend SR toward real-world applications from the perspective of image resampling. Extensive studies demonstrate, both quantitatively and qualitatively, that the proposed solution yields a practical image resampling model.	-
dc.description.tableofcontents	Abstract i
Contents iv
List of Tables viii
List of Figures x
1 Introduction 1
  1.1 Image Super-Resolution: Backgrounds 1
    1.1.1 Traditional approaches for image super-resolution 2
    1.1.2 Evaluating super-resolution algorithms 4
    1.1.3 Deep image super-resolution 4
  1.2 Challenges in Image Super-Resolution 6
  1.3 Outline of the Dissertation 7
2 Generalizing Super-Resolution Toward Real-World Inputs 9
  2.1 Introduction 9
  2.2 Related Work 11
    2.2.1 Super-resolution for bicubic downsampled images 11
    2.2.2 Low-resolution image synthesis for super-resolution 12
    2.2.3 Learning to simulate real-world low-resolution images 13
    2.2.4 Paired datasets for real-world super-resolution 14
  2.3 Learning to Downsample 14
    2.3.1 Learning an unknown downsampling process 15
    2.3.2 Data constraint in the downsampling model 16
    2.3.3 LFL: Data loss over low-frequency components 19
    2.3.4 Details about LFL 22
    2.3.5 ADL: Adaptive data loss 22
    2.3.6 Network architecture 25
    2.3.7 Learning super-resolution with our downsampler 27
  2.4 LFL and ADL: Experiments 28
    2.4.1 Dataset configurations 28
    2.4.2 Detailed training configurations 31
    2.4.3 Evaluation of the downsampling model 31
    2.4.4 Evaluating simulated low-resolution images 32
    2.4.5 Super-resolution on the synthetic examples 35
    2.4.6 Super-resolution on the RealSR-V3 dataset 38
    2.4.7 Ablation study 47
    2.4.8 Analysis on ADL 50
    2.4.9 Detailed comparison with KernelGAN 50
    2.4.10 Stability of LFL and ADL 55
    2.4.11 Super-resolution on the real-world images 57
  2.5 Conclusion 62
3 Generalizing Super-Resolution Toward Arbitrary Transformations 63
  3.1 Introduction 63
  3.2 Related Work 66
    3.2.1 Regular deep super-resolution 66
    3.2.2 Super-resolution for arbitrary resolution 66
    3.2.3 Irregular spatial sampling in CNNs 67
  3.3 Extending Super-Resolution Toward General Resampling 67
    3.3.1 Super-resolution under homography 68
    3.3.2 Handling spatially-varying degradations 69
    3.3.3 Adaptive resampling by local distortion 70
    3.3.4 Mathematical model for the adaptive coordinate system 71
    3.3.5 Adaptive kernel estimation from the transformed coordinates 73
    3.3.6 Leveraging multiscale knowledge for SRWarp 75
    3.3.7 Multiscale warping and blending 76
    3.3.8 SRWarp 78
  3.4 SRWarp: Experiments 79
    3.4.1 Training and evaluation dataset 81
    3.4.2 Evaluation metric 83
    3.4.3 Detailed training configurations 83
    3.4.4 Ablation study 84
    3.4.5 Comparison with the other methods 90
    3.4.6 Visualizing spatially-varying property 91
    3.4.7 Perceptual loss for SRWarp 93
    3.4.8 Arbitrary-scale super-resolution with SRWarp 95
    3.4.9 Over the homographic transformation 98
  3.5 Conclusion 101
4 Generalizing Super-Resolution Toward Image Resampling 103
  4.1 Introduction 103
  4.2 Related Work 106
    4.2.1 Extending image super-resolution toward arbitrary resampling 106
    4.2.2 Learning methodologies for real-world super-resolution 106
    4.2.3 Self-supervised learning 107
  4.3 Learning to Warp without Paired Examples 108
    4.3.1 Pre-training of generalized image SR models 108
    4.3.2 Self-supervised fine-tuning for general image resampling 110
    4.3.3 Analyzing fine-tuning of the warping model 112
    4.3.4 SelfWarp: Self-supervised image warping model 112
  4.4 SelfWarp: Experiments 113
    4.4.1 Datasets and metric 113
    4.4.2 Detailed training configurations 114
    4.4.3 Quantitative comparison on synthetic datasets 115
    4.4.4 Qualitative comparison on synthetic and realistic datasets 116
    4.4.5 Ablation study for SelfWarp 116
  4.5 Conclusion 121
5 Conclusion 123
  5.1 Summary of the Dissertation 123
  5.2 Future Works 125
Abstract (in Korean) 145
Acknowledgements 149	-
dc.format.extent	xii, 150	-
dc.language.iso	eng	-
dc.publisher	Seoul National University Graduate School	-
dc.subject	image processing	-
dc.subject	super-resolution	-
dc.subject	image warping	-
dc.subject	unsupervised learning	-
dc.subject	self-supervised learning	-
dc.subject.ddc	621.3	-
dc.title	Generalized Resampling Model for Practical Image Super-Resolution	-
dc.title.alternative	영상 초해상도 알고리즘의 현실 문제 해결을 위한 일반화된 리샘플링 모델	-
dc.type	Thesis	-
dc.type	Dissertation	-
dc.contributor.AlternativeAuthor	Sanghyun Son	-
dc.contributor.department	College of Engineering, Department of Electrical and Computer Engineering	-
dc.description.degree	Ph.D.	-
dc.date.awarded	2023-08	-
dc.contributor.major	Computer Vision	-
dc.identifier.uci	I804:11032-000000178282	-
dc.identifier.holdings	000000000050▲000000000058▲000000178282▲	-
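
Note on terminology: the abstract treats super-resolution as a special case of image resampling under a geometric transformation such as a homography. The sketch below illustrates only the classical baseline for that operation, backward warping with bilinear interpolation, and is not the learned SRWarp or SelfWarp model from the dissertation. All function names, the top-left-aligned coordinate convention, and the toy 2x2 image are assumptions made for illustration.

```python
# Classical backward warping with bilinear interpolation.
# This is a textbook baseline, NOT the learned SRWarp/SelfWarp models.
# All names and the coordinate convention are illustrative assumptions.


def apply_homography(h, x, y):
    """Map (x, y) through a 3x3 homography given as nested lists."""
    xs = h[0][0] * x + h[0][1] * y + h[0][2]
    ys = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return xs / w, ys / w


def bilinear(img, x, y):
    """Sample a grayscale image (list of rows) at a fractional position.

    Coordinates outside the image are clamped to the border.
    """
    height, width = len(img), len(img[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0, y0 = int(x), int(y)
    x1, y1 = min(x0 + 1, width - 1), min(y0 + 1, height - 1)
    fx, fy = x - x0, y - y0
    top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
    bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
    return top * (1 - fy) + bottom * fy


def warp(img, inverse_h, out_height, out_width):
    """Fill each output pixel by looking up its source position via inverse_h."""
    out = []
    for v in range(out_height):
        row = []
        for u in range(out_width):
            sx, sy = apply_homography(inverse_h, float(u), float(v))
            row.append(bilinear(img, sx, sy))
        out.append(row)
    return out


# Hypothetical 2x2 input and a non-integer x1.5 enlargement,
# expressed as the inverse (output-to-input) homography.
image = [[0.0, 1.0], [2.0, 3.0]]
scale = 1.5
inverse_scale = [[1 / scale, 0.0, 0.0], [0.0, 1 / scale, 0.0], [0.0, 0.0, 1.0]]
enlarged = warp(image, inverse_scale, 3, 3)

# Warping with the identity transform leaves the image unchanged.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
unchanged = warp(image, identity, 2, 2)
```

Because the scale factor enters only through the inverse transform, the same routine covers integer scales, arbitrary scales, and general projective warps. What it cannot do is recover detail missing from the input, which is the gap the learned models described in the abstract aim to close.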

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Electrical and Computer Engineering (전기·정보공학부)
  - Theses (Ph.D. / Sc.D._전기·정보공학부)

Files in This Item:

000000178282.pdf 47.23 MB
