Korean Sentence Complexity Reduction for Machine Translation

Luke Bates

Authors: Luke Bates

Advisor: 신효필

Major: College of Humanities, Department of Linguistics

Issue Date: 2017-02

Publisher: Seoul National University Graduate School

Keywords: text simplification ; machine translation ; Korean ; natural language processing ; neural networks

Description: Thesis (Master's) -- Seoul National University Graduate School : Department of Linguistics, 2017. 2. 신효필.

Abstract: Text simplification as a preprocessing step for improving natural language processing systems has a long history of research on European languages, yet no research has taken Korean as its object of study. At the same time, there is great demand for comprehensible Korean-to-English machine translation, but because the two languages differ so greatly, machine translation often fails to produce fluent results.
To improve the translation quality of Korean source text, this study designed, built, and implemented the first-ever rule-based Korean complexity reduction system. The system relies on a unique technique termed "phrase-grouping and generalization of nuance structures," applied to Korean as a disambiguation tool. This technique has potential applications in all languages and in additional natural language processing tasks. In addition, to establish which complexity reduction operations, and which combinations of them, yield fluent Korean and improved machine translation output, a unique factorial approach to simplification generation was also implemented.
To assess the output of the proposed system, two evaluations were conducted in parallel: Korean native speakers evaluated the simplified Korean text, and English native speakers evaluated the translations. The translation systems used in this study were Google Translate and Moses, both statistical machine translation systems, and Naver Translate, a neural machine translation system. This is the first research to conduct experiments on the interaction of text simplification and neural networks. Additionally, no known research has analyzed output from three machine translation systems simultaneously.
In general, the proposed system generated relatively fluent Korean, although, because simplifications were generated factorially, sentence quality usually began to deteriorate once more than one simplification operation was applied. As a preprocessing step for machine translation, however, the proposed system consistently improved translation quality for all three systems used in this study when up to two simplifications were performed.
For the statistical machine translation systems used in this study, more than two simplifications degraded not only Korean sentence quality but also translation quality. For Naver Translate, the neural machine translation system used in this study, however, even three simplifications improved translation quality according to the evaluators. This study therefore emphasizes the need for more research on text simplification as the field of natural language processing transitions to neural network-based approaches and applications.
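
Illustrative sketch (not taken from the thesis): the abstract describes generating simplifications factorially, that is, trying complexity reduction operations both individually and in combination. The Python snippet below shows one way such an enumeration could look. Everything in it is an assumption made for illustration: the function name `factorial_variants`, the three toy operations, and the English example sentence are hypothetical stand-ins, and the actual rule-based Korean operations are defined only in the thesis itself.

```python
import re
from itertools import combinations
from typing import Callable, Dict, Tuple

Operation = Callable[[str], str]


# Hypothetical toy operations on English text, used only as placeholders
# for the thesis's Korean complexity reduction rules.
def drop_parenthetical(sentence: str) -> str:
    """Remove parenthesized asides together with the space before them."""
    return re.sub(r"\s*\([^)]*\)", "", sentence)


def split_on_semicolon(sentence: str) -> str:
    """Turn a semicolon-joined clause into a separate sentence."""
    return sentence.replace("; ", ". ")


def shorten_phrase(sentence: str) -> str:
    """Replace a wordy connective with a shorter one."""
    return sentence.replace("in order to", "to")


OPERATIONS: Dict[str, Operation] = {
    "drop_parenthetical": drop_parenthetical,
    "split_on_semicolon": split_on_semicolon,
    "shorten_phrase": shorten_phrase,
}


def factorial_variants(
    sentence: str, operations: Dict[str, Operation]
) -> Dict[Tuple[str, ...], str]:
    """Apply every non-empty combination of operations to the sentence.

    Operations within a combination are applied in dictionary order.
    With n operations this yields 2**n - 1 variants.
    """
    names = list(operations)
    variants: Dict[Tuple[str, ...], str] = {}
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            text = sentence
            for name in combo:
                text = operations[name](text)
            variants[combo] = text
    return variants


if __name__ == "__main__":
    example = "We met (briefly) in order to talk; it went well."
    for combo, text in factorial_variants(example, OPERATIONS).items():
        print(len(combo), "+".join(combo), "->", text)
```

Grouping the variants by the number of operations applied (the first printed column) mirrors how the abstract reports results: quality after one, two, or three simplifications.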

Language: English

URI: https://hdl.handle.net/10371/131962

Files in This Item:

000000141927.pdf 2.21 MB

Appears in Collections:

College of Humanities (인문대학)
- Linguistics (언어학과)
  - Theses (Master's Degree_언어학과)
