
Deep Memory Networks for Natural Conversations

DC Field: Value
dc.contributor.advisor: 장병탁
dc.contributor.author: 장하영
dc.date.accessioned: 2017-10-27T16:41:30Z
dc.date.available: 2017-10-27T16:41:30Z
dc.date.issued: 2017-08
dc.identifier.other: 000000146388
dc.identifier.uri: https://hdl.handle.net/10371/136798
dc.description: Thesis (Ph.D.) -- 서울대학교 대학원 공과대학 전기·컴퓨터공학부, 2017. 8. 장병탁.
dc.description.abstract: Attention-based models were first proposed in the field of computer vision and later spread to natural language processing (NLP). Neural machine translation was the first model to successfully bring the attention mechanism from computer vision into NLP. The motivation is that, instead of decoding from a single fixed-length encoding of the whole sentence in one pass of a neural machine translation network, the model can attend to the specific part of the sentence that is currently relevant; these parts may be words or phrases.
The basic problem that the attention mechanism solves is that it allows the network to refer back to the input sequence, instead of forcing it to encode all information into one fixed-length vector. The attention mechanism simply gives the network access to its internal memory, namely the hidden states of the encoder. From this point of view, rather than choosing what to attend to, the network chooses what to retrieve from memory. Unlike typical memory, the memory access here is soft: the network retrieves a weighted combination of all memory locations, not a value from a single discrete location. Making the memory access soft has the benefit that the network can easily be trained end-to-end using backpropagation.
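To make the soft read concrete, the following is a minimal NumPy sketch of one such attention step; the dot-product scoring and the names query and encoder_states are illustrative assumptions, not the exact formulation used in this dissertation.

    # Minimal soft attention over encoder hidden states (illustrative sketch).
    import numpy as np

    def soft_attention(query, encoder_states):
        """query          : (d,)   current controller/decoder state
           encoder_states : (T, d) encoder hidden states, acting as memory"""
        scores = encoder_states @ query            # one score per memory location
        weights = np.exp(scores - scores.max())
        weights /= weights.sum()                   # softmax -> soft addressing
        context = weights @ encoder_states         # weighted sum of all locations
        return context, weights

Because every operation in the read is differentiable, the whole mechanism can be trained end-to-end with backpropagation, as stated above.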
The trend towards more complex memory structures continues. End-to-End Memory Networks allow the network to read the same input sequence multiple times before producing an output, updating the memory contents at each step; for example, a question can be answered by making multiple reasoning steps over an input story. However, when the network's parameter weights are tied in a certain way, the memory mechanism in End-to-End Memory Networks is identical to the attention mechanism presented here, except that it makes multiple hops over the memory.
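As a rough illustration of this multi-hop reading, the sketch below repeats a soft memory read for a fixed number of hops in the spirit of End-to-End Memory Networks; the hop count, the shared embeddings, and the additive state update are assumptions for illustration only.

    # Multi-hop soft memory read, MemN2N-style (illustrative sketch).
    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    def multi_hop_read(question, memories, n_hops=3):
        """question : (d,)    embedded query
           memories : (N, d)  embedded input sentences (the story)"""
        u = question
        for _ in range(n_hops):
            p = softmax(memories @ u)   # attention over memory slots
            o = p @ memories            # read: weighted sum of memories
            u = u + o                   # update controller state before the next hop
        return u                        # passed to an output layer to answer

With tied weights, each hop is exactly the soft attention step shown earlier, applied repeatedly.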
In this dissertation, we propose a deep memory network with an attention mechanism, together with word and sentence embedding methods designed for that attention mechanism. Thanks to the external memory and attention mechanism, the proposed method can handle various natural language processing tasks, such as question answering, machine comprehension, and sentiment analysis. The attention mechanism usually requires a huge computational cost; to address this problem, we also propose novel word and sentence embedding methods. Previous embedding methods rely only on the Markov assumption, whereas taking the structure of language into account and exploiting it greatly helps to reduce the computational cost. Moreover, the proposed method does not require strong supervision, that is, additional information on which sentences are important.
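The contrast between linear-window (Markov-style) contexts and structure-aware contexts can be illustrated as follows; the toy dependency parse and the helper functions are hypothetical and are not the Dependency-Gram or CR-Gram models proposed later in the dissertation.

    # Window contexts vs. dependency-based contexts (illustrative sketch).
    sentence = ["the", "cat", "chased", "the", "mouse"]

    # (head_index, dependent_index) pairs of a hand-written toy parse
    dependencies = [(2, 1), (2, 4), (1, 0), (4, 3)]

    def window_contexts(tokens, i, size=2):
        lo, hi = max(0, i - size), min(len(tokens), i + size + 1)
        return [tokens[j] for j in range(lo, hi) if j != i]

    def dependency_contexts(tokens, i, deps):
        heads = [tokens[h] for h, d in deps if d == i]
        dependents = [tokens[d] for h, d in deps if h == i]
        return heads + dependents

    print(window_contexts(sentence, 2))                     # ['the', 'cat', 'the', 'mouse']
    print(dependency_contexts(sentence, 2, dependencies))   # ['cat', 'mouse']

On this toy example the dependency contexts are fewer but syntactically closer to the target word, which is the intuition behind using language structure to reduce the computational cost.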
dc.description.tableofcontents: Chapter 1. Introduction 1
1.1 Background and Motivation 1
1.2 Approach and Contributions 3
1.3 Organization of the Dissertation 5

Chapter 2. Related Work 7
2.1 Memory Networks 7
2.2 End-to-End Memory Networks 10
2.3 Dynamic Memory Networks 13

Chapter 3. Conceptual Word Embedding 20
3.1 Related Work 20
3.2 Dependency-Gram 22
3.3 Experimental Results 26
3.4 Discussion and Summary 29


Chapter 4. Sentence Embedding using Context 31
4.1 Related Work 31
4.2 CR-Gram 35
4.3 Experimental Results 41
4.4 Discussion and Summary 43

Chapter 5. Deep Memory Networks 46
5.1 Related Work 46
5.2 Deep Memory Networks 48
5.3 Experimental Results 54
5.3.1 bAbI Dataset 54
5.3.2 Stanford Sentiment Treebank 57
5.3.3 SQuAD Dataset 58
5.4 Discussion and Summary 60

Chapter 6. Concluding Remarks 62
6.1 Summary and Discussion 62
6.2 Future Work 65


References 65

초록 (Abstract in Korean) 76
dc.format: application/pdf
dc.format.extent: 1631783 bytes
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: 서울대학교 대학원
dc.subject: Attention Model
dc.subject: Memory Network
dc.subject: Deep Learning
dc.subject: Natural Language Understanding
dc.subject: Machine Comprehension
dc.subject.ddc: 621.3
dc.title: Deep Memory Networks for Natural Conversations
dc.type: Thesis
dc.description.degree: Doctor
dc.contributor.affiliation: 공과대학 전기·컴퓨터공학부
dc.date.awarded: 2017-08