Instance-based Hierarchical Schema Alignment in Linked Data

종남소

DC Field	Value	Language
dc.contributor.advisor	김홍기	-
dc.contributor.author	종남소	-
dc.date.accessioned	2017-07-14T05:43:11Z	-
dc.date.available	2017-07-14T05:43:11Z	-
dc.date.issued	2015-08	-
dc.identifier.other	000000056795	-
dc.identifier.uri	https://hdl.handle.net/10371/125082	-
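dc.description	Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Dental Science, Major in Healthcare Management and Informatics, August 2015. 김홍기.	-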
dc.description.abstract	As the Web of documents has grown, so has the need to share, exchange, and merge heterogeneous data in order to provide more comprehensive information and to answer more complex user questions. However, data published on the Web are often raw dumps that sacrifice much of the semantics that could be used for exchanging and integrating data. The Resource Description Framework (RDF) and Linked Data are designed to expose the semantics of data by interlinking data represented with well-defined relations. With the profusion of RDF resources and Linked Data, ontology alignment has gained significance as a way to provide comprehensive knowledge embedded in disparate sources. Ontology alignment in Linking Open Data (LOD), however, has traditionally focused on the instance level rather than the schema level. Linked Data supports schema-level matching, provided that instance-level matching is already established. Linked Data is therefore fertile ground for instance-based schema matching, which is considered a better solution for matching classes with ambiguous or obscure names. This dissertation focuses on three issues in instance-based schema alignment for Linked Data: (1) how to align schemas based on instances, (2) how to scale the schema alignment, and (3) how to generate a hierarchical schema structure. To address the first issue, the author proposes an instance-based schema alignment algorithm called IUT. The IUT builds a unified taxonomy for the classes of two ontologies from an instance-class matrix and derives the relation between two classes from their common instances. The author tested the IUT on DBpedia and YAGO2 and compared it with two state-of-the-art methods in four alignment tasks. The experiments show that the IUT outperforms both methods in efficiency and effectiveness (e.g., it takes 968 ms to obtain an F-score of 0.810 on intra-subsumption alignment in DBpedia).
To address the second issue, the author proposes a scaled version of the IUT called IUT(M). Based on Locality Sensitive Hashing (LSH), the IUT(M) reduces the computation of the IUT in two ways: (1) it reduces the cost of the similarity computation for each pair of classes with MinHash functions, and (2) it reduces the number of similarity computations with banding. The author tested the IUT(M) on the YAGO2-YAGO2 intra-subsumption alignment task and demonstrated that the running time of the IUT can be reduced by 94% with a 5% loss in F-score. To address the third issue, the author proposes a method for generating a faceted taxonomy based on object properties in Linked Data. The proposed framework builds a sub-taxonomy for each facet from the sub-data extracted with an object property, using an Instance-based Concept Taxonomy generation algorithm called ICT. Two experiments demonstrate that (1) the ICT efficiently and effectively generates a sub-taxonomy with rdf:type in DBpedia and YAGO2 (e.g., it takes 49 ms and 11,790 ms to build concept taxonomies that achieve Taxonomic F-scores of 0.917 and 0.780, respectively), and (2) the faceted taxonomies for Diseasome and DrugBank, generated efficiently from multiple object properties (e.g., it takes 2,032 ms and 2,525 ms to build the faceted taxonomies from 6 and 16 properties, respectively), can effectively reduce the search space in faceted search (e.g., Maximum Resolution values of 1.65 and 1.03 with 2 facets).	-
dc.description.tableofcontents	1 Introduction; 1.1 Background and Motivations; 1.1.1 Data Integration and Schema Alignment; 1.1.2 From RDF to Linked Data; 1.1.3 Schema Alignment in Linked Data; 1.2 Instance-based Schema Alignment; 1.3 Contributions of this Dissertation; 1.4 Organization of this Dissertation; 2 Preliminaries and Related Works; 2.1 Preliminaries; 2.1.1 RDF and Linked Data; 2.1.2 Ontology and Schema Alignment in Linked Data; 2.2 Related Works; 2.2.1 Instance-based Schema Alignment; 2.2.2 Scaling Pairwise Similarity Computations; 2.2.3 Automatic Taxonomy Generation; 3 Aligning Schemas with Subsumption and Equivalence Relations; 3.1 Introduction; 3.2 Problem Definition; 3.3 Methods; 3.3.1 Workflow of Instance-based Schema Alignment; 3.3.2 Instance-class Matrix Generation; 3.3.3 Subsumption and Equivalence Relations Discovering; 3.4 Experiments; 3.4.1 Schema Alignment Algorithms in Comparison; 3.4.2 Data and Experiment Design; 3.5 Results; 3.5.1 Intra-subsumption Relations for YAGO2-YAGO2; 3.5.2 Intra-subsumption Relations for DBpedia-DBpedia; 3.5.3 Inter-Subsumption and Equivalence Relations for YAGO2-DBpedia; 3.5.4 Effects of χ_s and χ_e for the IUT; 3.6 Discussions; 3.7 Conclusion; 4 Scaling Pair-wise Computations Using the Locality Sensitive Hashing; 4.1 Introduction; 4.2 Methods; 4.2.1 MinHash and Signatures; 4.2.2 Banding Technique; 4.2.3 Scaling the IUT with MinHash and Banding; 4.3 Experiment; 4.4 Discussions; 4.5 Conclusion; 5 Unsupervised Hierarchical Schema Structure Generation in Linked Data; 5.1 Introduction; 5.2 Faceted Taxonomy for Linked Data; 5.3 Framework; 5.3.1 Facets Extraction; 5.3.2 Instance Restriction and Redundancy Removal; 5.3.3 Redundant Object Removal; 5.3.4 Instance-object Matrix Generation; 5.4 Generating Faceted Taxonomy; 5.4.1 The Problem of Generating a Sub-taxonomy for a Facet; 5.4.2 Concept Definition and Naming; 5.4.3 Taxonomy Generation Algorithm; 5.4.4 Instantiation and Taxonomy Refinement; 5.5 Experiments; 5.5.1 Task 1: Construction of Taxonomy with rdf:type; 5.5.2 Task 2: Construction of Multiple Faceted Taxonomies; 5.6 Results; 5.6.1 Results of Task 1; 5.6.2 Results of Task 2; 5.7 Discussion; 5.8 Conclusion; 6 Future Works and Conclusion; 6.1 Future Works; 6.1.1 Similarity Measures for Instance-based Schema Alignment; 6.1.2 Ontology Evolution for Instance-based Schema Alignment; 6.1.3 Combining the IUT with Structure- and Lexical-based Methods; 6.1.4 Scaling the IUT with Parallel Computations; 6.1.5 Faceted Navigation and Search for Linked Data; 6.2 Conclusion; Bibliography; Abstract (in Korean)	-
dc.format	application/pdf	-
dc.format.extent	3141673 bytes	-
dc.format.medium	application/pdf	-
dc.language.iso	en	-
dc.publisher	Seoul National University Graduate School	-
dc.subject	Schema Alignment	-
dc.subject	Instance-based Matching	-
dc.subject	Linked Data	-
dc.subject	Scaling Alignment	-
dc.subject	Hierarchy Generation	-
dc.subject.ddc	617	-
dc.title	Instance-based Hierarchical Schema Alignment in Linked Data	-
dc.type	Thesis	-
dc.description.degree	Doctor	-
dc.citation.pages	154	-
dc.contributor.affiliation	School of Dentistry, Department of Dental Science	-
dc.date.awarded	2015-08	-
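
The abstract says that relations between two classes are obtained from their common instances in an instance-class matrix. The Python sketch below illustrates that general idea only; it is not the IUT algorithm from the dissertation. The function name `class_relations`, the example data, and the way the thresholds are applied are assumptions. The names `chi_s` and `chi_e` echo the symbols χ_s and χ_e in the table of contents, but their exact definitions are not given in this record.

```python
from itertools import combinations


def class_relations(instance_classes, chi_s=0.9, chi_e=0.9):
    """Infer class relations from shared instances (illustrative sketch).

    instance_classes maps each instance to the set of classes it belongs to,
    i.e. a sparse instance-class matrix. chi_s and chi_e are assumed
    thresholds for subsumption and equivalence.
    """
    # Invert the matrix: class -> set of member instances.
    members = {}
    for instance, classes in instance_classes.items():
        for cls in classes:
            members.setdefault(cls, set()).add(instance)

    relations = []
    for a, b in combinations(sorted(members), 2):
        common = len(members[a] & members[b])
        if common == 0:
            continue  # no shared instances, no evidence of a relation
        jaccard = common / len(members[a] | members[b])
        if jaccard >= chi_e:
            relations.append((a, "equivalent", b))
        elif common / len(members[a]) >= chi_s:
            # Nearly all instances of a are also instances of b.
            relations.append((a, "subclass_of", b))
        elif common / len(members[b]) >= chi_s:
            relations.append((b, "subclass_of", a))
    return relations


# Hypothetical toy data; the class names only imitate DBpedia/YAGO2 style.
example = {
    "Seoul": {"dbo:City", "dbo:Place", "yago:City"},
    "Busan": {"dbo:City", "dbo:Place", "yago:City"},
    "Hallasan": {"dbo:Mountain", "dbo:Place"},
}
relations = class_relations(example)
for relation in relations:
    print(relation)
```

On this toy input the two city classes share all of their instances and are reported as equivalent, while each of the narrower classes is reported as a subclass of `dbo:Place`.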

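The two LSH ingredients named in the abstract, MinHash signatures and banding, can be sketched in the same spirit. This is the generic textbook technique rather than the IUT(M) implementation. The hash construction (seeded BLAKE2b), the signature length, the number of bands, and all function names are assumptions chosen for illustration. Each class is represented by its non-empty set of instance identifiers.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def minhash_signature(items, num_hashes=32):
    """MinHash signature of a non-empty set: per seed, the smallest hash value."""
    signature = []
    for seed in range(num_hashes):
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimate_jaccard(sig_a, sig_b):
    """Fraction of matching signature positions, an estimate of Jaccard similarity."""
    return sum(x == y for x, y in zip(sig_a, sig_b)) / len(sig_a)


def candidate_pairs(signatures, bands=8):
    """Banding: only classes that agree on every row of some band become candidates."""
    rows = len(next(iter(signatures.values()))) // bands
    buckets = defaultdict(set)
    for name, sig in signatures.items():
        for band in range(bands):
            key = (band, tuple(sig[band * rows:(band + 1) * rows]))
            buckets[key].add(name)
    pairs = set()
    for names in buckets.values():
        pairs.update(combinations(sorted(names), 2))
    return pairs


# Hypothetical classes with their instance sets.
classes = {
    "A": {"i1", "i2", "i3", "i4"},
    "B": {"i1", "i2", "i3", "i4"},
    "C": {"x1", "x2", "x3", "x4"},
}
signatures = {name: minhash_signature(inst) for name, inst in classes.items()}
pairs = candidate_pairs(signatures)
```

MinHash replaces each set comparison with a comparison of short fixed-length signatures, and banding restricts comparisons to pairs that collide in at least one bucket, which is how the two steps reduce the per-pair cost and the number of pairs respectively.
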
Appears in Collections:

College of Dentistry/School of Dentistry (치과대학/치의학대학원)
- Dept. of Dental Science(치의과학과)
  - Theses (Ph.D. / Sc.D._치의과학과)

Files in This Item:

000000056795.pdf 3.00 MB
