Publications
Detailed Information
Bootstrapping information extraction via conceptualization
Cited 2 time in
Web of Science
Cited 4 time in Scopus
- Authors
- Issue Date
- 2021-04
- Publisher
- IEEE
- Citation
- Proceedings - International Conference on Data Engineering, Vol.2021-April, pp.49-60
- Abstract
- © 2021 IEEE.Bootstrapping enables us to use existing knowledge to find patterns and extract new knowledge from free texts, from which more patterns can be found. Due to its minimally supervised, domain-independent, and language-independent nature, it has been widely adopted in real-world applications. However, as iterations go on, semantic drift may happen. The extraction may shift from the target class to other classes and result in errors, which propagate in the succeeding iterations and hurt the performance significantly. Existing solutions simply throw away bad patterns, sacrificing recall to ensure high precision. However, we argue that most of these patterns and instances can be kept as long as being applied selectively, guided by prior knowledge. In this paper, we propose a pattern-based extraction framework with three distinguished features: (1) it uses conceptual taxonomies to guide the extraction to reduce semantic drift; (2) it uses the knowledge of existing triples to improve the precision; (3) it integrates all patterns to form a generalized pattern set with quantified confidence measurement. The proposed solution is applied on enriching two real-world knowledge bases and achieves higher precision and recall compared to existing solutions.
- ISSN
- 1084-4627
- Files in This Item:
- There are no files associated with this item.
Item View & Download Count
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.