Efficient Training of Implicit Neural Representation GANs

이남우

Title: Efficient Training of Implicit Neural Representation GANs (original Korean title: 암시적 신경 표현을 이용한 적대적 생성 모델의 효율적인 학습 방법)

Authors: 이남우

Advisor: 유승주

Issue Date: 2022

Publisher: Seoul National University Graduate School

Keywords: Generative adversarial network ; Implicit neural representation ; Efficient training ; Deep learning

Description: Thesis (Master's) -- Seoul National University Graduate School : College of Engineering, Dept. of Computer Science and Engineering, 2022.2. 유승주.

Abstract: Generative adversarial networks (GANs) have advanced enormously despite their very short history and can now generate fake images that are indistinguishable from real ones. Alongside these gains in generation quality, data-efficient GAN training has been studied extensively, but GAN training that is efficient in computation and memory has not, even though such work is essential for tasks such as training on ultra-high-resolution images of 4K and above. This gap exists because innovating on the training method has been difficult for the 3x3 convolutional GANs that are still in common use. Recently, however, GANs built from multi-layer perceptrons have begun to appear, obtained by applying implicit neural representation (INR) to GANs. A GAN using INR (INR-GAN) can generate any desired portion of an image at any desired resolution, which gives it a potential for efficient training that could not be attempted with 3x3 convolutional GANs. We exploit this property and propose multi-stage training as a method for efficient training of INR-GANs. We further propose a multi-scale reconstruction loss and a contrastive loss without negative pairs to maximize the performance of multi-stage training. Experimental results show that the proposed method reaches the same FID as the conventional training method while using only ~20% of the MACs and ~20% of the GPU memory.
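
The property the abstract relies on, that an INR-based generator can produce any portion of an image at any resolution, can be illustrated with a minimal sketch. This is not the thesis's generator: `TinyINR`, `make_coords`, the sine activation, and the random fixed weights are all hypothetical stand-ins, and there is no latent code or training here. The sketch only shows that the same coordinate-to-intensity function can be queried on a coarse full-image grid or on a dense grid over a crop, with cost proportional to the number of queried pixels.

```python
import math
import random


def make_coords(x0, x1, y0, y1, width, height):
    """Pixel-center coordinates covering [x0, x1] x [y0, y1] in normalized image space."""
    xs = [x0 + (x1 - x0) * (i + 0.5) / width for i in range(width)]
    ys = [y0 + (y1 - y0) * (j + 0.5) / height for j in range(height)]
    return [[(x, y) for x in xs] for y in ys]


class TinyINR:
    """Hypothetical two-layer MLP mapping (x, y) to one intensity in (0, 1)."""

    def __init__(self, hidden=16, seed=0):
        rng = random.Random(seed)  # fixed seed: weights are random but reproducible
        self.w1 = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(hidden)]
        self.b1 = [rng.gauss(0, 1) for _ in range(hidden)]
        self.w2 = [rng.gauss(0, 1) / math.sqrt(hidden) for _ in range(hidden)]

    def pixel(self, x, y):
        # Hidden layer with a sine activation (an assumption of this sketch).
        h = [math.sin(w[0] * x + w[1] * y + b) for w, b in zip(self.w1, self.b1)]
        z = sum(wi * hi for wi, hi in zip(self.w2, h))
        return 1.0 / (1.0 + math.exp(-z))  # squash to (0, 1)

    def render(self, coords):
        # One MLP evaluation per requested coordinate, nothing else.
        return [[self.pixel(x, y) for x, y in row] for row in coords]


inr = TinyINR()

# Whole image at a coarse 8x8 resolution: 64 MLP evaluations.
full_low = inr.render(make_coords(0.0, 1.0, 0.0, 1.0, 8, 8))

# Only the region [0.25, 0.5] x [0.25, 0.5], sampled densely at 32x32.
crop_high = inr.render(make_coords(0.25, 0.5, 0.25, 0.5, 32, 32))

print(len(full_low), len(full_low[0]), len(crop_high), len(crop_high[0]))
```

A convolutional generator normally has to synthesize the whole feature map to obtain any output pixel, whereas here the caller chooses the region and the sampling density per call. That freedom is what a staged training schedule could exploit; the actual stages and losses are defined in the thesis itself and are not reproduced here.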

Language: kor

URI: https://hdl.handle.net/10371/183360

https://dcollection.snu.ac.kr/common/orgView/000000170091

Files in This Item:

000000170091.pdf 2.85 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Computer Science and Engineering (컴퓨터공학부)
  - Theses (Master's Degree_컴퓨터공학부)
