Building Named Entity Knowledge Graph Using Named Entity Normalization : 고유명사 정규화 기법을 이용한 지식 그래프 구축

DC Field | Value | Language
dc.contributor.advisor | 조성준 | -
dc.contributor.author | 전성환 | -
dc.date.accessioned | 2023-06-29T01:52:13Z | -
dc.date.available | 2023-06-29T01:52:13Z | -
dc.date.issued | 2023 | -
dc.identifier.other | 000000176653 | -
dc.identifier.uri | https://hdl.handle.net/10371/193131 | -
dc.identifier.uri | https://dcollection.snu.ac.kr/common/orgView/000000176653 | ko_KR
dc.description | Thesis (Ph.D.) -- Seoul National University Graduate School : College of Engineering, Department of Industrial Engineering, 2023. 2. 조성준. | -
dc.description.abstract | Text mining aims to extract information from documents to derive valuable insights. A knowledge graph provides richer information drawn from various documents. Past literature responded to such needs by building technology trees or concept networks from the bibliographic information of the documents, or by relying on text mining techniques to extract keywords and/or phrases. In this paper, we propose a framework for building a knowledge graph using named entities. The knowledge graph construction framework in this paper satisfies the following conditions: (1) extracting named entities in their complete form, (2) building datasets on which named entity normalization (NEN) models can be trained and evaluated in various domains, such as finance and technical documents, in addition to bioinformatics, where existing NEN research has been active, (3) creating a better-performing named entity normalization model, and (4) constructing the knowledge graph by grouping named entities that have the same meaning but appear in various forms. | -
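dc.description.abstract | Text mining aims to extract information from documents in order to obtain various insights. The knowledge graph, one way of representing the information in documents, provides richer information from a variety of documents. Previous studies used text mining techniques to build technology trees or concept networks from document information, or to extract keywords and phrases. This thesis proposes a framework for building a knowledge graph using named entities. The proposed knowledge graph construction framework satisfies the following conditions: (1) it extracts named entities in a form that is easy for people to understand; (2) it builds named entity normalization datasets from named entities extracted from financial documents and semiconductor-related patent documents, in addition to bioinformatics, where existing named entity normalization research has been active; (3) it builds a better-performing named entity normalization model; and (4) it constructs a knowledge graph by grouping named entities that have the same meaning but appear in various forms. | -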
dc.description.tableofcontents | Chapter 1 Introduction 1
Chapter 2 Literature review 5
2.1 Named entity normalization dataset 5
2.2 Named entity normalization 6
2.3 Knowledge graph construction 9
Chapter 3 Dictionary construction for named entity normalization 11
3.1 Background 11
3.2 Dictionary construction methods 12
3.2.1 Finance named entity normalization dataset 12
3.2.2 Patent named entity normalization dataset 18
3.3 Chapter summary 24
Chapter 4 Named entity normalization model using edge weight updating neural network 26
4.1 Background 26
4.2 Proposed model 28
4.2.1 Ground truth entity graph construction 31
4.2.2 Similarity-based entity graph construction 32
4.2.3 Edge weight updating neural network training 35
4.2.4 Edge weight updating neural network inferencing 38
4.3 Experiment results 39
4.3.1 Datasets 39
4.3.2 Experiment settings: named entity normalization in bioinformatics 40
4.3.3 Experiment Settings: Named Entity Normalization in Finance 42
4.4 Results 44
4.4.1 Quantitative Analysis: Bioinformatics 45
4.4.2 Quantitative Analysis: Finance 46
4.4.3 Qualitative Analysis 47
4.5 Chapter summary 51
Chapter 5 Building knowledge graph using named entity recognition and normalization models 53
5.1 Background 53
5.2 Proposed model 55
5.2.1 Named entity normalization 56
5.2.2 Construction of the semiconductor-related patent knowledge graph 61
5.3 Experiment results 62
5.3.1 Comparison models 62
5.3.2 Parameter settings 64
5.4 Results 64
5.4.1 Quantitative evaluations 64
5.4.2 Qualitative evaluations 70
5.4.3 Knowledge graph visualization and exemplary investigation 71
5.5 Chapter summary 75
Chapter 6 Conclusion 77
6.1 Contributions 77
6.2 Future work 78
Bibliography 79
Abstract in Korean 92
Acknowledgements 93 | -
dc.format.extent | ix, 93 | -
dc.language.iso | eng | -
dc.publisher | Seoul National University Graduate School | -
dc.subject | Named entity normalization | -
dc.subject | Edge weight updating neural network | -
dc.subject | Text mining in bioinformatics | -
dc.subject | Text mining in finance | -
dc.subject | Text mining in patent documents | -
dc.subject | Named entity graph | -
dc.subject | Knowledge graph | -
dc.subject | Patent graph | -
dc.subject | Keyword extraction | -
dc.subject.ddc | 670.42 | -
dc.title | Building Named Entity Knowledge Graph Using Named Entity Normalization | -
dc.title.alternative | 고유명사 정규화 기법을 이용한 지식 그래프 구축 | -
dc.type | Thesis | -
dc.type | Dissertation | -
dc.contributor.AlternativeAuthor | Sung Hwan Jeon | -
dc.contributor.department | College of Engineering, Department of Industrial Engineering | -
dc.description.degree | Ph.D. | -
dc.date.awarded | 2023-02 | -
dc.identifier.uci | I804:11032-000000176653 | -
dc.identifier.holdings | 000000000049▲000000000056▲000000176653▲ | -
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.