
Variational Learning for A Hierarchical Model of Conversations

DC Field: Value
dc.contributor.advisor: 김건희
dc.contributor.author: 박유군
dc.date.accessioned: 2019-05-07T03:19:40Z
dc.date.available: 2019-05-07T03:19:40Z
dc.date.issued: 2019-02
dc.identifier.other: 000000153687
dc.identifier.uri: https://hdl.handle.net/10371/150807
dc.description: Master's thesis -- Graduate School, Seoul National University: College of Engineering, Department of Computer Science and Engineering, February 2019. Advisor: 김건희.
dc.description.abstract: Variational autoencoders (VAEs) combined with hierarchical RNNs provide a powerful framework for conversation modeling. However, these models suffer from a degeneration problem in which they learn to ignore the latent variables. We empirically show that this problem has two main causes. First, the autoregressive distribution-modeling power of hierarchical RNNs is strong enough to model the data without relying on the latent variables. Second, in the context-conditional VAE structure, the conversation context is fully given, so the next utterance can be inferred almost deterministically; the hierarchical RNNs can therefore easily overfit the training data. To solve this problem, we propose a hierarchical model named Variational Hierarchical Conversation RNNs (VHCR), which exploits two key ideas: (1) a hierarchical structure of latent variables, and (2) utterance drop regularization. On two datasets, the Cornell Movie Dialog Corpus and the Ubuntu Dialog Corpus, we empirically show that the model surpasses the previous state-of-the-art performance. Moreover, the hierarchical latent-variable structure enables new forms of control over utterance content within a conversation.
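For reference, the latent variables discussed in the abstracts are trained with the standard VAE objective, the evidence lower bound (ELBO); in conventional notation (not transcribed from the thesis), with encoder $q_\phi(z \mid x)$, decoder $p_\theta(x \mid z)$, and prior $p_\theta(z)$:

```latex
\log p_\theta(x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x)}\!\left[\log p_\theta(x \mid z)\right]
\;-\; D_{\mathrm{KL}}\!\left(q_\phi(z \mid x)\,\middle\|\,p_\theta(z)\right)
```

The degeneration described above corresponds to the KL term collapsing to zero, so that $q_\phi(z \mid x)$ matches the prior and the decoder generates while ignoring $z$.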
dc.description.abstract: Variational autoencoders (VAEs) combined with hierarchical RNNs have emerged as a powerful framework for conversation modeling. However, they suffer from the notorious degeneration problem, where the RNN decoders learn to ignore latent variables and reduce to vanilla RNNs. We empirically show that this degeneracy occurs mostly due to two reasons. First, the expressive power of hierarchical RNN decoders is often high enough to model the data using only their decoding distributions, without relying on latent variables to capture the variability of the data. Second, the context-conditional VAE structure, whose utterance generation process is conditioned on the current context of the conversation, deprives training targets of variability; that is, target utterances in the training corpus can be deterministically deduced from the context, making the RNN decoders prone to overfitting given their expressive power. To solve the degeneration problem, we propose a novel hierarchical model named Variational Hierarchical Conversation RNNs (VHCR), involving two key ideas: (1) using a hierarchical structure of latent variables, and (2) exploiting an utterance drop for regularization of hierarchical RNNs. With evaluations on two datasets, the Cornell Movie Dialog and Ubuntu Dialog Corpus, we show that our VHCR successfully utilizes latent variables and outperforms state-of-the-art models for conversation generation. Moreover, it can perform several new utterance control tasks thanks to its hierarchical latent structure.
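The utterance drop mentioned in the abstract can be sketched as follows. This is a minimal illustration of the idea (randomly replacing utterance encodings with a generic vector during training, so the context path becomes unreliable and the decoder must lean on the latent variables), not the thesis implementation; the function name, the zero-vector choice for the generic encoding, and the default drop rate are all assumptions.

```python
import random


def utterance_drop(utterance_vectors, drop_prob=0.25, rng=random):
    """Regularize a hierarchical context RNN's input.

    With probability `drop_prob`, each utterance encoding is replaced
    by a generic (here: zero) vector, weakening the deterministic
    context signal so the model cannot ignore its latent variables.
    """
    noised = []
    for vec in utterance_vectors:
        if rng.random() < drop_prob:
            noised.append([0.0] * len(vec))  # dropped: generic encoding
        else:
            noised.append(list(vec))  # kept: copy of the original encoding
    return noised
```

In a full model this would be applied per training step, with the drop decision resampled every epoch; at `drop_prob=0.0` the context is untouched, and at `1.0` the decoder sees no utterance history at all.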
dc.description.tableofcontents:
Abstract i
Contents iii
List of Figures v
List of Tables vii

Chapter 1 Introduction 1

Chapter 2 Related Works 5
2.1 Conversation Modeling . . . . . . . . . . . . . . . . . . . . . . . . 5
2.2 Degeneracy of Variational Autoencoders . . . . . . . . . . . . . . 6

Chapter 3 Approach 7
3.1 Preliminary: Variational Autoencoder . . . . . . . . . . . . . . . 7
3.2 VHRED . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
3.3 The Degeneration Problem . . . . . . . . . . . . . . . . . . . . . 10
3.4 Empirical Observation on Degeneracy . . . . . . . . . . . . . . . 12
3.5 Variational Hierarchical Conversation RNN (VHCR) . . . . . . . 14
3.6 Effectiveness of Hierarchical Latent Structure . . . . . . . . . . . 17

Chapter 4 Results 19
4.1 Experimental Setting . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.1.1 Datasets . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
4.1.2 Baselines . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
4.1.3 Performance Measures . . . . . . . . . . . . . . . . . . . . 20
4.1.4 Implementation Details . . . . . . . . . . . . . . . . . . . 20
4.1.5 Human Evaluation . . . . . . . . . . . . . . . . . . . . . . 21
4.2 Results of Negative Log-likelihood . . . . . . . . . . . . . . . . . 21
4.3 Results of Embedding-Based Metrics . . . . . . . . . . . . . . . . 23
4.4 Results of Human Evaluation . . . . . . . . . . . . . . . . . . . . 25
4.5 Qualitative Analyses . . . . . . . . . . . . . . . . . . . . . . . . . 26
4.5.1 Comparison of Predicted Responses . . . . . . . . . . . . 26
4.5.2 Interpolation on Conversation Latent Variable . . . . . . 26
4.5.3 Generation with Fixed Conversation Latent Variable . . . 27

Chapter 5 Conclusion 28

요약 (Korean Abstract) 32

Acknowledgements 33
dc.language.iso: eng
dc.publisher: 서울대학교 대학원 (Graduate School, Seoul National University)
dc.subject.ddc: 621.39
dc.title: Variational Learning for A Hierarchical Model of Conversations
dc.type: Thesis
dc.type: Dissertation
dc.description.degree: Master
dc.contributor.affiliation: College of Engineering, Department of Computer Science and Engineering
dc.date.awarded: 2019-02
dc.identifier.uci: I804:11032-000000153687
dc.identifier.holdings: 000000000026▲000000000039▲000000153687▲
