A Study on the use of Etymology for Semantic Knowledge Extraction

Pablo Estrada

DC Field	Value	Language
dc.contributor.advisor	정교민	-
dc.contributor.author	Pablo Estrada	-
dc.date.accessioned	2017-07-19T08:42:42Z	-
dc.date.available	2017-07-19T08:42:42Z	-
dc.date.issued	2016-08	-
dc.identifier.other	000000135984	-
dc.identifier.uri	https://hdl.handle.net/10371/131254	-
dc.description	Thesis (Master's) -- Graduate School, Seoul National University : Computational Science and Technology, 2016. 8. 정교민.	-
dc.description.abstract	Etymology is the study of the composition of words through their historical roots. It is a rich area of study that dates back millennia and has contributed significantly to our understanding of human cultures and languages. Computational linguistics is a much younger field that grew out of the advent of the digital era and has advanced continuously, most recently with the changes brought by Artificial Intelligence and Machine Learning. Computational linguistics has not yet leveraged the knowledge of etymology to its full potential, and this work is a step toward making etymology another contributor to the field. We propose a framework that captures the complex etymological relationships in the vocabulary of a human language by building a complex network that associates words with their historical roots. We then use this framework to obtain insights into the semantics of words in the Chinese and Korean languages. We run two tasks, one of supervised learning and one of unsupervised learning, and show that etymology can be used effectively to extract knowledge. We believe that this work helps push etymology onto the main stage of computational linguistics and natural language processing.	-
dc.description.tableofcontents	1 Introduction; 1.1 Synonym pair classification; 1.2 Word embedding; 2 Literature review; 3 An etymological graph-based framework; 3.1 Building an Etymological Graph; 3.2 Obtaining semantic knowledge from the graph; 4 Two use-cases of the framework; 4.1 Supervised learning: Finding synonyms through classification; 4.1.1 The edge classification problem; 4.1.2 The synonym-link prediction problem; 4.1.3 Results of the classification schemes; 4.2 Unsupervised learning: Word embedding with etymology; 4.2.1 Learning word embeddings; 4.2.2 Verifying the word embeddings: Synonym discovery; 4.2.3 Performance of synonym discovery task; 4.2.4 Computation speed of our model; 5 Discussion, Conclusion, and Future Work; 5.1 Discussion; 5.2 Conclusion and Future Work; Bibliography; Abstract in Korean; Appendices; A Unipartite projection; B Supervised learning features; C Performance of embeddings	-
dc.format	application/pdf	-
dc.format.extent	6255635 bytes	-
dc.format.medium	application/pdf	-
dc.language.iso	en	-
dc.publisher	Graduate School, Seoul National University	-
dc.subject	graph mining	-
dc.subject	etymology	-
dc.subject	computational linguistics	-
dc.subject	Chinese language	-
dc.subject.ddc	004	-
dc.title	A Study on the use of Etymology for Semantic Knowledge Extraction	-
dc.type	Thesis	-
dc.description.degree	Master	-
dc.citation.pages	44	-
dc.contributor.affiliation	College of Natural Sciences, Program in Computational Science and Technology	-
dc.date.awarded	2016-08	-

Appears in Collections:

College of Natural Sciences (자연과학대학)
- Program in Computational Science and Technology (협동과정-계산과학전공)
  - Theses (Master's Degree, Program in Computational Science and Technology)

Files in This Item:

000000135984.pdf 5.97 MB
