Detailed Information

Automatic Generation of Morpheme Level Reordering Rules for Korean to English Machine Translation

DC Field Value Language
dc.contributor.advisor 신효필 -
dc.contributor.author Breanna Castellani -
dc.date.accessioned 2017-07-19T09:47:05Z -
dc.date.available 2017-07-19T09:47:05Z -
dc.date.issued 2017-02 -
dc.identifier.other 000000141950 -
dc.identifier.uri https://hdl.handle.net/10371/131963 -
dc.description Thesis (Master's) -- Seoul National University Graduate School: Department of Linguistics, February 2017. 신효필. -
dc.description.abstract Word order is one of the main challenges that machine translation systems must overcome when dealing with any linguistically divergent language pair, such as Korean and English. Statistical machine translation (SMT) models are often inadequate for long-distance reordering due to the distortion penalties they impose. Rule-based systems, on the other hand, are often costly, in both time and money, to build and maintain.
The present research proposes a new hybrid approach for Korean to English machine translation. While previous approaches have focused on the word, our approach considers the morpheme as the basic unit of translation for this language pair. We begin by developing a classification model to disambiguate Korean functional morphemes based on alignment pairs and context feature data. Then, according to our automatically generated rules, we apply this model in a preprocessing step to reorder the morphemes to better match English sentence structure.
After retraining our statistical translation system, Moses, results indicate an improvement in overall translation quality. When the SMT system's internal lexicalized reordering is restricted, we note an increase in the BLEU score of 3.5% over the SMT-only baseline. In the case where we do not limit decoding-time reordering, an even greater BLEU score increase of 4.42% is observed. We also find evidence to suggest that our changes enable Moses to execute additional reordering operations at decoding time that it was previously unable to perform.
-
dc.description.tableofcontents Chapter 1. Introduction 1
Chapter 2. Literature Review 6
2.1 Machine Translation 6
2.2 Reordering 10
2.3 Korean to English MT 12
Chapter 3. Corpus Data and SMT System 14
3.1 Background 14
3.2 Preparation 15
3.3 Moses 17
Chapter 4. Rule Generation 19
4.1 Corpus Processing 20
4.1.1 Suggested Korean-English Alignments 21
4.1.2 Feature Sets 24
4.1.3 Reordering Movement 26
4.2 Rule Creation 33
4.3 Input Preprocessing 35
4.3.1 Rule Matching 35
4.3.2 Morpheme Reordering 38
4.4 Examples 40
Chapter 5. Results 44
Chapter 6. Conclusion 49
References 51
Appendix A: Rules 55
Abstract in Korean 64
-
dc.format application/pdf -
dc.format.extent 679559 bytes -
dc.format.medium application/pdf -
dc.language.iso en -
dc.publisher Seoul National University Graduate School -
dc.subject automatic rule generation -
dc.subject Korean-English MT -
dc.subject hybrid machine translation -
dc.subject rule-based preprocessing -
dc.subject morpheme reordering -
dc.subject.ddc 401 -
dc.title Automatic Generation of Morpheme Level Reordering Rules for Korean to English Machine Translation -
dc.type Thesis -
dc.description.degree Master -
dc.citation.pages 69 -
dc.contributor.affiliation College of Humanities, Department of Linguistics -
dc.date.awarded 2017-02 -
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
