Automatic Generation of Morpheme Level Reordering Rules for Korean to English Machine Translation

Breanna Castellani

DC Field	Value	Language
dc.contributor.advisor	신효필	-
dc.contributor.author	Breanna Castellani	-
dc.date.accessioned	2017-07-19T09:47:05Z	-
dc.date.available	2017-07-19T09:47:05Z	-
dc.date.issued	2017-02	-
dc.identifier.other	000000141950	-
dc.identifier.uri	https://hdl.handle.net/10371/131963	-
dc.description	Thesis (Master's) -- Seoul National University Graduate School: Department of Linguistics, February 2017. 신효필.	-
dc.description.abstract	Word order is one of the main challenges that machine translation systems must overcome when dealing with any linguistically divergent language pair, such as Korean and English. Statistical machine translation (SMT) models often handle long-distance reordering poorly due to the distortion penalties they impose. Rule-based systems, on the other hand, are often costly, in both time and money, to build and maintain. The present research proposes a new hybrid approach for Korean to English machine translation. While previous approaches have focused on the word, our approach considers the morpheme as the basic unit of translation for this language pair. We begin by developing a classification model to disambiguate Korean functional morphemes based on alignment pairs and context feature data. Then, according to our automatically generated rules, we apply this model in a preprocessing step to reorder the morphemes to better match English sentence structure. After we retrain our statistical translation system, Moses, results indicate an improvement in overall translation quality. When the SMT system's internal lexicalized reordering is restricted, we note an increase in the BLEU score of 3.5% over the SMT-only baseline. In the case where we do not limit decoding-time reordering, an even greater BLEU score increase of 4.42% is observed. We also find evidence to suggest that our changes enable Moses to execute additional reordering operations at decoding time that it was previously unable to perform.	-
dc.description.tableofcontents	Chapter 1. Introduction; Chapter 2. Literature Review; 2.1 Machine Translation; 2.2 Reordering; 2.3 Korean to English MT; Chapter 3. Corpus Data and SMT System; 3.1 Background; 3.2 Preparation; 3.3 Moses; Chapter 4. Rule Generation; 4.1 Corpus Processing; 4.1.1 Suggested Korean-English Alignments; 4.1.2 Feature Sets; 4.1.3 Reordering Movement; 4.2 Rule Creation; 4.3 Input Preprocessing; 4.3.1 Rule Matching; 4.3.2 Morpheme Reordering; 4.4 Examples; Chapter 5. Results; Chapter 6. Conclusion; References; Appendix A: Rules; Abstract in Korean	-
dc.format	application/pdf	-
dc.format.extent	679559 bytes	-
dc.format.medium	application/pdf	-
dc.language.iso	en	-
dc.publisher	Seoul National University Graduate School (서울대학교 대학원)	-
dc.subject	automatic rule generation	-
dc.subject	Korean-English MT	-
dc.subject	hybrid machine translation	-
dc.subject	rule-based preprocessing	-
dc.subject	morpheme reordering	-
dc.subject.ddc	401	-
dc.title	Automatic Generation of Morpheme Level Reordering Rules for Korean to English Machine Translation	-
dc.type	Thesis	-
dc.description.degree	Master	-
dc.citation.pages	69	-
dc.contributor.affiliation	College of Humanities, Department of Linguistics (인문대학 언어학과)	-
dc.date.awarded	2017-02	-

Appears in Collections:

College of Humanities (인문대학)
- Linguistics (언어학과)
  - Theses (Master's Degree_언어학과)

Files in This Item:

000000141950.pdf 0.65 MB
