Enhancing Attribute-Factorized Representations in Variational Autoencoder by Regularizing Multiple Mutual Information Elements

Abstract: Deep generative models that learn representations of data and generate new samples have been studied extensively in recent years. We consider how to learn representations of target attributes separately from representations of the remaining attributes, and how to disentangle the two in a deep generative model. To this end we introduce a new generative model based on the Variational AutoEncoder (VAE), named MMVAE (Multiple Mutual information elements VAE). The objective function of MMVAE enhances attribute-factorized representations by regularizing multiple mutual information elements. Specifically, we construct a framework that explicitly regularizes the mutual information of each pair among the attributes, the attribute representations, and the other representations, using Mutual Information Neural Estimation (MINE, Belghazi et al., 2018). The objective function consists of an evidence lower bound (ELBO) and three mutual information regularizers. The formulation corresponds to a minimax game: the autoencoder parameters are optimized to minimize the objective function, while the parameters of the mutual information regularizers are optimized to maximize it. Through a series of experiments on the CelebA dataset, we demonstrate that the model learns the target attribute representations and the other representations in a better factorized form, and that these factorized representations are useful for generating images with the target attributes.
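Abstract (translated from the Korean): Research on deep generative models that learn representations of data and generate new samples has recently been very active. We considered the relationship between representations associated with a specific attribute and those associated with the other attributes, and examined how to handle them separately in a deep generative model. In this work we introduce a new generative model based on the Variational Autoencoder (VAE), MMVAE (Multiple mutual information VAE), together with an objective function that strengthens the factorization of attributes in the representation by regularizing multiple mutual information terms. In particular, we adopt Mutual Information Neural Estimation (MINE, Belghazi et al., 2018) to build a framework that explicitly regularizes the mutual information among the attribute labels, the attribute representations, and the other representations. In this model the objective function consists of the evidence lower bound (ELBO) and three mutual information terms. This corresponds to a minimax game: the autoencoder parameter group is optimized to minimize the objective function, whereas the mutual information parameter group is optimized to maximize it. Through a series of experiments on the CelebA dataset, we demonstrated that MMVAE learns the attribute representations and the other representations with better separation, and that the resulting attribute-factorized representation is useful for generating images that contain a given attribute.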
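The mutual information regularizers mentioned in the abstract rely on MINE, which estimates mutual information by maximizing the Donsker-Varadhan lower bound I(X; Z) >= E_{p(x,z)}[T(x, z)] - log E_{p(x)p(z)}[exp T(x, z)] over a critic T. The following is a minimal sketch of evaluating that bound, not the thesis's implementation: the function names, the NumPy setting, and the toy Gaussian data are assumptions made for illustration, and the critic is a fixed closed-form log density ratio rather than the trained neural network that MINE would use.

```python
import numpy as np


def dv_lower_bound(critic, x, z, rng):
    """Donsker-Varadhan lower bound on I(X; Z) for a fixed critic.

    `critic` is any function T(x, z) returning one score per sample.
    In MINE the critic is a neural network trained to maximize this
    value; here it is passed in as-is (illustrative assumption).
    """
    # Scores on paired samples, which follow the joint p(x, z).
    joint_scores = critic(x, z)
    # Shuffling z breaks the pairing, approximating samples from p(x)p(z).
    z_shuffled = z[rng.permutation(len(z))]
    marginal_scores = critic(x, z_shuffled)
    # Numerically stable log of the mean of exp(scores).
    m = marginal_scores.max()
    log_mean_exp = m + np.log(np.mean(np.exp(marginal_scores - m)))
    return joint_scores.mean() - log_mean_exp


def gaussian_log_ratio(rho):
    """Closed-form log p(x, z) / (p(x) p(z)) for a standard bivariate
    Gaussian with correlation rho (a hypothetical stand-in critic)."""
    def critic(x, z):
        quad = (x**2 - 2 * rho * x * z + z**2) / (2 * (1 - rho**2))
        return -0.5 * np.log(1 - rho**2) - quad + 0.5 * (x**2 + z**2)
    return critic


# Toy data: two correlated Gaussian variables (not CelebA).
rng = np.random.default_rng(0)
rho, n = 0.5, 20000
x = rng.standard_normal(n)
z = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

estimate = dv_lower_bound(gaussian_log_ratio(rho), x, z, rng)
true_mi = -0.5 * np.log(1 - rho**2)  # analytic MI of the toy model
print(f"DV estimate: {estimate:.4f}, analytic MI: {true_mi:.4f}")
```

With the optimal critic the bound is tight, so the printed estimate should be close to the analytic value for this toy model. In the minimax game described in the abstract, the critic parameters would be updated to raise such a bound while the autoencoder parameters are updated to minimize the overall objective; the exact weighting and sign of each of the three regularizers is not given in the abstract and is not reproduced here.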

Language: eng

URI: https://hdl.handle.net/10371/161522

http://dcollection.snu.ac.kr/common/orgView/000000156932

Files in This Item:

000000156932.pdf 1.57 MB

Appears in Collections:

Graduate School of Convergence Science and Technology (융합과학기술대학원)
- Dept. of Transdisciplinary Studies(융합과학부)
  - Theses (Master's Degree_융합과학부)
