
Deep Memory Networks for Natural Conversations

DC Field: Value
dc.contributor.advisor: 장병탁
dc.contributor.author: 장하영
dc.date.accessioned: 2017-10-27T16:41:30Z
dc.date.available: 2017-10-27T16:41:30Z
dc.date.issued: 2017-08
dc.identifier.other: 000000146388
dc.identifier.uri: https://hdl.handle.net/10371/136798
dc.description: Thesis (Ph.D.) -- 서울대학교 대학원 공과대학 전기·컴퓨터공학부, 2017. 8. 장병탁.
dc.description.abstract: Attention-based models were first proposed in the field of computer vision and later spread to natural language processing (NLP). Neural machine translation was the first model to successfully bring the attention mechanism from computer vision into NLP. The motivation is that, instead of decoding from a single fixed-length encoding of the whole sentence in one pass of a neural machine translation network, the model can attend to the specific part of the sentence that is currently relevant; these parts may be words or phrases.
The basic problem that the attention mechanism solves is that it allows the network to refer back to the input sequence, instead of forcing it to encode all information into one fixed-length vector. The attention mechanism simply gives the network access to its internal memory, namely the hidden states of the encoder. From this point of view, rather than choosing what to attend to, the network chooses what to retrieve from memory. Unlike typical memory, the memory access here is soft: the network retrieves a weighted combination of all memory locations, not a value from a single discrete location. Making the memory access soft has the benefit that the network can easily be trained end-to-end using backpropagation.
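To make the soft read concrete, the following is a minimal NumPy sketch of one such attention step; the dot-product scoring and the names query and encoder_states are illustrative assumptions, not the exact formulation used in this dissertation.

    # Minimal soft attention over encoder hidden states (illustrative sketch).
    import numpy as np

    def soft_attention(query, encoder_states):
        """query          : (d,)   current controller/decoder state
           encoder_states : (T, d) encoder hidden states, acting as memory"""
        scores = encoder_states @ query            # one score per memory location
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                   # softmax -> soft addressing
        context = weights @ encoder_states         # weighted sum of all locations
        return context, weights

Because every operation in the read is differentiable, the whole mechanism can be trained end-to-end with backpropagation, as stated above.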
The trend towards more complex memory structures continues. End-to-End Memory Networks allow the network to read the same input sequence multiple times before producing an output, updating the memory contents at each step; for example, a question can be answered by making multiple reasoning steps over an input story. However, when the network's parameter weights are tied in a certain way, the memory mechanism in End-to-End Memory Networks is identical to the attention mechanism presented here, except that it makes multiple hops over the memory.
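As a rough illustration of this multi-hop reading, the sketch below repeats a soft memory read for a fixed number of hops in the spirit of End-to-End Memory Networks; the hop count, the shared embeddings, and the additive state update are assumptions for illustration only.

    # Multi-hop soft memory read, MemN2N-style (illustrative sketch).
    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def multi_hop_read(question, memories, n_hops=3):
        """question : (d,)    embedded query
           memories : (N, d)  embedded input sentences (the story)"""
        u = question
        for _ in range(n_hops):
            p = softmax(memories @ u)   # attention over memory slots
            o = p @ memories            # read: weighted sum of memories
            u = u + o                   # update controller state before the next hop
        return u                        # passed to an output layer to answer

With tied weights, each hop is exactly the soft attention step shown earlier, applied repeatedly.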
In this dissertation, we propose a deep memory network with an attention mechanism, together with word and sentence embedding methods designed for that attention mechanism. Thanks to the external memory and attention mechanism, the proposed method can handle various natural language processing tasks, such as question answering, machine comprehension, and sentiment analysis. The attention mechanism usually requires a huge computational cost; to address this problem, we also propose novel word and sentence embedding methods. Previous embedding methods rely only on the Markov assumption, whereas taking the structure of language into account and exploiting it greatly helps to reduce the computational cost. Moreover, the proposed method does not require strong supervision, that is, additional information on which sentences are important.
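The contrast between linear-window (Markov-style) contexts and structure-aware contexts can be illustrated as follows; the toy dependency parse and the helper functions are hypothetical and are not the Dependency-Gram or CR-Gram models proposed later in the dissertation.

    # Window contexts vs. dependency-based contexts (illustrative sketch).
    sentence = ["the", "cat", "chased", "the", "mouse"]

    # (head_index, dependent_index) pairs of a hand-written toy parse
    dependencies = [(2, 1), (2, 4), (1, 0), (4, 3)]

    def window_contexts(tokens, i, size=2):
        lo, hi = max(0, i - size), min(len(tokens), i + size + 1)
        return [tokens[j] for j in range(lo, hi) if j != i]

    def dependency_contexts(tokens, i, deps):
        heads = [tokens[h] for h, d in deps if d == i]
        dependents = [tokens[d] for h, d in deps if h == i]
        return heads + dependents

    print(window_contexts(sentence, 2))                     # ['the', 'cat', 'the', 'mouse']
    print(dependency_contexts(sentence, 2, dependencies))   # ['cat', 'mouse']

On this toy example the dependency contexts are fewer but syntactically closer to the target word, which is the intuition behind using language structure to reduce the computational cost.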
dc.description.tableofcontents: Chapter 1. Introduction 1
1.1 Background and Motivation 1
1.2 Approach and Contributions 3
1.3 Organization of the Dissertation 5

Chapter 2. Related Work 7
2.1 Memory Networks 7
2.2 End-to-End Memory Networks 10
2.3 Dynamic Memory Networks 13

Chapter 3. Conceptual Word Embedding 20
3.1 Related Work 20
3.2 Dependency-Gram 22
3.3 Experimental Results 26
3.4 Discussion and Summary 29


Chapter 4. Sentence Embedding using Context 31
4.1 Related Work 31
4.2 CR-Gram 35
4.3 Experimental Results 41
4.4 Discussion and Summary 43

Chapter 5. Deep Memory Networks 46
5.1 Related Work 46
5.2 Deep Memory Networks 48
5.3 Experimental Results 54
5.3.1 bAbI Dataset 54
5.3.2 Stanford Sentiment Treebank 57
5.3.3 SQuAD Dataset 58
5.4 Discussion and Summary 60

Chapter 6. Concluding Remarks 62
6.1 Summary and Discussion 62
6.2 Future Work 65


References 65

초록 (Abstract in Korean) 76
dc.format: application/pdf
dc.format.extent: 1631783 bytes
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: 서울대학교 대학원
dc.subject: Attention Model
dc.subject: Memory Network
dc.subject: Deep Learning
dc.subject: Natural Language Understanding
dc.subject: Machine Comprehension
dc.subject.ddc: 621.3
dc.title: Deep Memory Networks for Natural Conversations
dc.type: Thesis
dc.description.degree: Doctor
dc.contributor.affiliation: 공과대학 전기·컴퓨터공학부
dc.date.awarded: 2017-08