Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

Abstract: Generative adversarial networks (GANs) have been successful in synthesizing realistic images from latent vectors and in manipulating those synthetic images. However, it remains challenging for GANs to manipulate real images, especially in real time. State-of-the-art GAN-based methods for editing real images rely on time-consuming operations to project real images to latent vectors. Alternatively, an encoder can be trained to embed real images into the latent space instantly, but the resulting reconstruction loses many details. We propose StyleMapGAN, which adopts a novel latent representation, called the stylemap, that incorporates spatial dimensions into the embedding. Because each spatial location in the stylemap contributes to its corresponding region of the generated image, real-time projection through the encoder becomes accurate and editing of real images becomes spatially controllable. Experimental results demonstrate that our method significantly outperforms state-of-the-art models in various image manipulation tasks such as local editing and image interpolation. In particular, detailed comparisons show that our local editing method successfully reflects not only the color and texture but also the shape of a reference image while preserving untargeted regions.
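Abstract (Korean version, translated): Generative adversarial networks (GANs) have been used successfully to generate images that do not exist but look real. The latent vector that generates each image can also be used to edit such fake (nonexistent) images. However, editing real images rather than fake ones is difficult, and doing so in real time is harder still. State-of-the-art GAN-based methods for editing real images must first project the real image to a latent vector, and this step takes a long time. As an alternative, an encoder can be trained to embed real images into the latent space instantly, but many details are lost when the latent vector is decoded back into an image. We propose StyleMapGAN, which has a new form of latent space called the stylemap; the stylemap is a latent space that adds spatial dimensions to the conventional vector space. Because each location in the stylemap corresponds to the matching region of the generated image, projection through the encoder becomes both real-time and accurate, and real images can be edited spatially. Extensive experimental results show that the proposed method far outperforms existing state-of-the-art methods in various image editing tasks (for example, local editing and image interpolation). In particular, detailed comparison experiments show that the proposed local editing method is effective: the original image is well preserved in untargeted regions, while in targeted regions not only the color and texture but also the shape of the reference image is transferred well.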
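
The spatial correspondence described in the abstract can be illustrated with a small sketch. It assumes, for illustration only, that a stylemap is an array of shape (channels, height, width) and that local editing mixes two stylemaps with a spatial mask; the abstract does not spell out the editing rule, and the function name `blend_stylemaps`, the 64x8x8 shape, and the random arrays standing in for encoder outputs are all hypothetical.

```python
import numpy as np


def blend_stylemaps(original, reference, mask):
    """Mix two stylemaps of shape (C, H, W) using a spatial mask of shape (H, W).

    Assumed rule (not taken from the abstract): where the mask is 1 the
    reference stylemap is used, where it is 0 the original is kept.
    """
    if original.shape != reference.shape:
        raise ValueError("stylemaps must share the same shape")
    if mask.shape != original.shape[1:]:
        raise ValueError("mask must match the stylemap's spatial size")
    # Clamp to [0, 1] and add a channel axis so the mask broadcasts over C.
    m = np.clip(mask.astype(original.dtype), 0.0, 1.0)[None, :, :]
    return m * reference + (1.0 - m) * original


# Hypothetical stand-ins for the stylemaps an encoder would produce.
rng = np.random.default_rng(0)
original = rng.standard_normal((64, 8, 8))
reference = rng.standard_normal((64, 8, 8))

# Edit only the top half of the spatial grid.
mask = np.zeros((8, 8))
mask[:4, :] = 1.0

edited = blend_stylemaps(original, reference, mask)
```

In this sketch the masked locations take their values from the reference and the rest stay untouched, which mirrors the abstract's claim that untargeted regions are preserved. A generator (not shown) would then decode `edited` into an image.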

Language: eng

URI: https://hdl.handle.net/10371/175397

https://dcollection.snu.ac.kr/common/orgView/000000163612

Files in This Item:

000000163612.pdf 3.84 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Computer Science and Engineering (컴퓨터공학부)
  - Theses (Master's Degree_컴퓨터공학부)
