A Study on the use of Etymology for Semantic Knowledge Extraction
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 정교민 | - |
dc.contributor.author | Pablo Estrada | - |
dc.date.accessioned | 2017-07-19T08:42:42Z | - |
dc.date.available | 2017-07-19T08:42:42Z | - |
dc.date.issued | 2016-08 | - |
dc.identifier.other | 000000135984 | - |
dc.identifier.uri | https://hdl.handle.net/10371/131254 | - |
dc.description | Thesis (Master's) -- Seoul National University Graduate School : Computational Science Major, 2016. 8. 정교민. | - |
dc.description.abstract | Etymology is the study of the composition of words through their historical roots. It is a rich area of study that dates back millennia and has contributed significantly to our understanding of human cultures and languages. Computational linguistics is a much younger field that grew out of the advent of the digital era and has advanced continuously, most recently with the changes brought by Artificial Intelligence and Machine Learning. Computational linguistics has not yet leveraged the knowledge of etymology to its full potential. This work is a step toward making etymology another contributor to the field of computational linguistics. In this work we propose a framework to capture the complex etymological relationships that exist in the vocabulary of a human language by creating a complex network that associates words with their historical roots. We then use this framework to obtain insights into the semantics of words in the Chinese and Korean languages. We run two tasks, one of supervised learning and one of unsupervised learning, and show that etymology can be used effectively to extract knowledge. We believe that this work helps push etymology onto the main stage of computational linguistics and natural language processing. | - |
dc.description.tableofcontents | 1 Introduction 6; 1.1 Synonym pair classification 7; 1.2 Word embedding 9; 2 Literature review 11; 3 An etymological graph-based framework 16; 3.1 Building an Etymological Graph 16; 3.2 Obtaining semantic knowledge from the graph 18; 4 Two use-cases of the framework 21; 4.1 Supervised learning: Finding synonyms through classification 21; 4.1.1 The edge classification problem 23; 4.1.2 The synonym-link prediction problem 23; 4.1.3 Results of the classification schemes 24; 4.2 Unsupervised learning: Word embedding with etymology 25; 4.2.1 Learning word embeddings 25; 4.2.2 Verifying the word embeddings: Synonym discovery 28; 4.2.3 Performance of synonym discovery task 28; 4.2.4 Computation speed of our model 31; 5 Discussion, Conclusion, and Future Work 33; 5.1 Discussion 33; 5.2 Conclusion and Future Work 34; Bibliography 36; Abstract in Korean 41; Appendices 42; A Unipartite projection 42; B Supervised learning features 43; C Performance of embeddings 44 | - |
dc.format | application/pdf | - |
dc.format.extent | 6255635 bytes | - |
dc.format.medium | application/pdf | - |
dc.language.iso | en | - |
dc.publisher | Seoul National University Graduate School | - |
dc.subject | graph mining | - |
dc.subject | etymology | - |
dc.subject | computational linguistics | - |
dc.subject | chinese language | - |
dc.subject.ddc | 004 | - |
dc.title | A Study on the use of Etymology for Semantic Knowledge Extraction | - |
dc.type | Thesis | - |
dc.description.degree | Master | - |
dc.citation.pages | 44 | - |
dc.contributor.affiliation | College of Natural Sciences, Interdisciplinary Program, Computational Science Major | - |
dc.date.awarded | 2016-08 | - |
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.