Building Named Entity Knowledge Graph Using Named Entity Normalization : 고유명사 정규화 기법을 이용한 지식 그래프 구축
DC Field | Value | Language |
---|---|---|
dc.contributor.advisor | 조성준 | - |
dc.contributor.author | 전성환 | - |
dc.date.accessioned | 2023-06-29T01:52:13Z | - |
dc.date.available | 2023-06-29T01:52:13Z | - |
dc.date.issued | 2023 | - |
dc.identifier.other | 000000176653 | - |
dc.identifier.uri | https://hdl.handle.net/10371/193131 | - |
dc.identifier.uri | https://dcollection.snu.ac.kr/common/orgView/000000176653 | ko_KR |
dc.description | Doctoral dissertation -- Seoul National University Graduate School : College of Engineering, Department of Industrial Engineering, 2023. 2. 조성준. | - |
dc.description.abstract | Text mining aims to extract information from documents to derive valuable insights. A knowledge graph provides richer information drawn from various documents. Past literature responded to such needs by building technology trees or concept networks from the bibliographic information of the documents, or by relying on text mining techniques to extract keywords and/or phrases. In this paper, we propose a framework for building a knowledge graph using named entities. The knowledge graph construction framework in this paper satisfies the following conditions: (1) extracting each named entity in its complete form, (2) building datasets on which named entity normalization (NEN) models can be trained and evaluated in various domains, such as finance and technical documents, in addition to bioinformatics, where existing NEN research has been active, (3) creating a better-performing named entity normalization model, and (4) constructing the knowledge graph by grouping named entities that have the same meaning but appear in various forms. | - |
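dc.description.abstract | Text mining aims to extract information from documents in order to obtain a variety of insights. The knowledge graph, one way of representing the information in documents, provides richer information from various documents. Previous studies used text mining techniques to build technology trees or concept networks from document information, or to extract keywords and phrases. This paper proposes a framework for building a knowledge graph using named entities. The knowledge graph construction framework in this paper satisfies the following conditions: (1) it extracts named entities in a form that is easy for people to understand; (2) it builds named entity normalization datasets from named entities extracted from financial documents and semiconductor-related patent documents, in addition to bioinformatics, where existing named entity normalization research has been active; (3) it builds a better-performing named entity normalization model; (4) it constructs a knowledge graph by grouping named entities that have the same meaning but appear in various forms. | - |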
dc.description.tableofcontents | Chapter 1 Introduction 1; Chapter 2 Literature review 5; 2.1 Named entity normalization dataset 5; 2.2 Named entity normalization 6; 2.3 Knowledge graph construction 9; Chapter 3 Dictionary construction for named entity normalization 11; 3.1 Background 11; 3.2 Dictionary construction methods 12; 3.2.1 Finance named entity normalization dataset 12; 3.2.2 Patent named entity normalization dataset 18; 3.3 Chapter summary 24; Chapter 4 Named entity normalization model using edge weight updating neural network 26; 4.1 Background 26; 4.2 Proposed model 28; 4.2.1 Ground truth entity graph construction 31; 4.2.2 Similarity-based entity graph construction 32; 4.2.3 Edge weight updating neural network training 35; 4.2.4 Edge weight updating neural network inferencing 38; 4.3 Experiment results 39; 4.3.1 Datasets 39; 4.3.2 Experiment settings: named entity normalization in bioinformatics 40; 4.3.3 Experiment settings: named entity normalization in finance 42; 4.4 Results 44; 4.4.1 Quantitative analysis: bioinformatics 45; 4.4.2 Quantitative analysis: finance 46; 4.4.3 Qualitative analysis 47; 4.5 Chapter summary 51; Chapter 5 Building knowledge graph using named entity recognition and normalization models 53; 5.1 Background 53; 5.2 Proposed model 55; 5.2.1 Named entity normalization 56; 5.2.2 Construction of the semiconductor-related patent knowledge graph 61; 5.3 Experiment results 62; 5.3.1 Comparison models 62; 5.3.2 Parameter settings 64; 5.4 Results 64; 5.4.1 Quantitative evaluations 64; 5.4.2 Qualitative evaluations 70; 5.4.3 Knowledge graph visualization and exemplary investigation 71; 5.5 Chapter summary 75; Chapter 6 Conclusion 77; 6.1 Contributions 77; 6.2 Future work 78; Bibliography 79; Abstract in Korean 92; Acknowledgements 93 | - |
dc.format.extent | ix, 93 | - |
dc.language.iso | eng | - |
dc.publisher | Seoul National University Graduate School | - |
dc.subject | Named entity normalization | - |
dc.subject | Edge weight updating neural network | - |
dc.subject | Text mining in bioinformatics | - |
dc.subject | Text mining in finance | - |
dc.subject | Text mining in patent documents | - |
dc.subject | Named entity graph | - |
dc.subject | Knowledge graph | - |
dc.subject | Patent graph | - |
dc.subject | Keyword extraction | - |
dc.subject.ddc | 670.42 | - |
dc.title | Building Named Entity Knowledge Graph Using Named Entity Normalization | - |
dc.title.alternative | 고유명사 정규화 기법을 이용한 지식 그래프 구축 | - |
dc.type | Thesis | - |
dc.type | Dissertation | - |
dc.contributor.AlternativeAuthor | Sung Hwan Jeon | - |
dc.contributor.department | College of Engineering, Department of Industrial Engineering | - |
dc.description.degree | Doctoral | - |
dc.date.awarded | 2023-02 | - |
dc.identifier.uci | I804:11032-000000176653 | - |
dc.identifier.holdings | 000000000049▲000000000056▲000000176653▲ | - |
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.