Deep Memory Networks for Natural Conversations

장하영

Authors: 장하영

Advisor: 장병탁

Major: College of Engineering, Department of Electrical and Computer Engineering

Issue Date: 2017-08

Publisher: Seoul National University Graduate School

Keywords: Attention Model ; Memory Network ; Deep Learning ; Natural Language Understanding ; Machine Comprehension

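Description: Doctoral dissertation -- Seoul National University Graduate School, College of Engineering, Department of Electrical and Computer Engineering, 2017. 8. 장병탁.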

Abstract: Attention-based models were first proposed in computer vision and then spread to natural language processing (NLP). Neural machine translation was the first NLP task to adopt the attention mechanism successfully. The motivation is that, instead of decoding from a single fixed-length encoding of the whole sentence in one pass, a neural translation network can attend to the specific part of the sentence that is currently relevant. These parts can be words or phrases.
The basic problem that the attention mechanism solves is that it lets the network refer back to the input sequence instead of forcing it to encode all information into one fixed-length vector. The attention mechanism simply gives the network access to its internal memory, which is the hidden state of the encoder. From this point of view, instead of choosing what to attend to, the network chooses what to retrieve from memory. Unlike typical memory, the memory access here is soft: the network retrieves a weighted combination of all memory locations rather than a value from a single discrete location. Soft memory access has the benefit that the network can easily be trained end-to-end using backpropagation.
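The soft memory access described above can be illustrated with a minimal sketch. It assumes dot-product scoring and plain Python lists; the names `softmax`, `soft_read`, `memory`, and `query` are hypothetical and are not taken from the dissertation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def soft_read(query, memory):
    """Soft memory read: a weighted combination of ALL memory slots.

    memory: list of equal-length vectors (e.g. encoder hidden states).
    query:  vector of the same length (e.g. the current decoder state).
    Dot-product scoring is an assumption made for this sketch.
    """
    scores = [sum(q * m for q, m in zip(query, slot)) for slot in memory]
    weights = softmax(scores)
    dim = len(memory[0])
    read = [sum(w * slot[d] for w, slot in zip(weights, memory))
            for d in range(dim)]
    return weights, read

# Three memory slots of dimension 2 and one query vector (toy values).
memory = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [2.0, 0.0]
weights, read = soft_read(query, memory)
```

Because every slot contributes with a non-zero weight and the softmax is differentiable, gradients reach all memory locations, which is what makes end-to-end training by backpropagation straightforward.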
The trend toward more complex memory structures is continuing. End-to-End Memory Networks allow the network to read the same input sequence multiple times before producing an output, updating the memory contents at each step; for example, a question can be answered by making multiple reasoning steps over an input story. However, when the network's parameter weights are tied in a certain way, the memory mechanism in End-to-End Memory Networks is identical to the attention mechanism presented here, except that it makes multiple hops over the memory.
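The multi-hop reading described above can be sketched as repeated soft reads over the same memory. This is a simplified illustration under stated assumptions, not the dissertation's model: the same vectors serve as both input and output memory, the weights are shared across hops, and the controller state is updated by adding the read vector. The names `hop`, `multi_hop`, `story`, and `query` are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def hop(state, memory):
    """One hop: attend over memory, then add the read vector to the state."""
    scores = [sum(s * m for s, m in zip(state, slot)) for slot in memory]
    weights = softmax(scores)
    read = [sum(w * slot[d] for w, slot in zip(weights, memory))
            for d in range(len(state))]
    new_state = [s + r for s, r in zip(state, read)]
    return weights, new_state

def multi_hop(query, memory, hops):
    """Run `hops` attention passes over the same memory (tied weights)."""
    state = list(query)
    trace = []  # attention weights from each hop, for inspection
    for _ in range(hops):
        weights, state = hop(state, memory)
        trace.append(weights)
    return state, trace

# Toy story of three sentence vectors and one question vector.
story = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [1.0, 0.0]
final_state, attention_per_hop = multi_hop(query, story, hops=3)
```

With `hops=1` the procedure reduces to a single soft attention read, which mirrors the observation that the two mechanisms coincide apart from the number of passes over the memory.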
In this dissertation, we propose a deep memory network with an attention mechanism, together with word and sentence embeddings for that mechanism. Owing to the external memory and the attention mechanism, the proposed method can handle various NLP tasks, such as question answering, machine comprehension, and sentiment analysis. Attention mechanisms usually incur a large computational cost. To address this problem, we also propose novel word and sentence embedding methods. Previous embedding methods rely only on the Markov assumption, but taking the structure of language into account helps reduce the computational cost. The proposed approach also does not need strong supervision, that is, additional information about which sentences are important.

Language: English

URI: https://hdl.handle.net/10371/136798

Files in This Item:

000000146388.pdf 1.56 MB

Appears in Collections:

College of Engineering/Engineering Practice School
- Dept. of Computer Science and Engineering
  - Theses (Ph.D. / Sc.D., Computer Science and Engineering)
