Bootstrapping information extraction via conceptualization

Cited 2 times in Web of Science; cited 4 times in Scopus
Authors

Liang, Jiaqing; Feng, Suo; Xie, Chenhao; Xiao, Yanghua; Chen, Jindong; Hwang, Seung-won

Issue Date
2021-04
Publisher
IEEE
Citation
Proceedings - International Conference on Data Engineering, Vol.2021-April, pp.49-60
Abstract
© 2021 IEEE. Bootstrapping enables us to use existing knowledge to find patterns and extract new knowledge from free text, from which more patterns can then be found. Because it is minimally supervised, domain-independent, and language-independent, it has been widely adopted in real-world applications. However, as iterations proceed, semantic drift may occur: the extraction may shift from the target class to other classes and produce errors, which propagate through succeeding iterations and significantly hurt performance. Existing solutions simply discard bad patterns, sacrificing recall to ensure high precision. We argue instead that most of these patterns and instances can be kept as long as they are applied selectively, guided by prior knowledge. In this paper, we propose a pattern-based extraction framework with three distinguishing features: (1) it uses conceptual taxonomies to guide the extraction and reduce semantic drift; (2) it uses the knowledge of existing triples to improve precision; (3) it integrates all patterns to form a generalized pattern set with a quantified confidence measurement. The proposed solution is applied to enriching two real-world knowledge bases and achieves higher precision and recall than existing solutions.
ISSN
1084-4627
URI
https://hdl.handle.net/10371/183751
DOI
https://doi.org/10.1109/ICDE51399.2021.00012
Files in This Item:
There are no files associated with this item.
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
