Publications

Detailed Information

Korean Language Modeling via Syntactic Guide

Cited 2 time in Web of Science Cited 2 time in Scopus
Authors

Kim, Hyeondey; Kim, Seonhoon; Kang, Inho; Kwak, Nojun; Fung, Pascale

Issue Date
2022
Publisher
EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA
Citation
LREC 2022: THIRTEEN INTERNATIONAL CONFERENCE ON LANGUAGE RESOURCES AND EVALUATION, pp.2841-2849
Abstract
While pre-trained language models play a vital role in modern language processing tasks, but not every language can benefit from them. Most existing research on pre-trained language models focuses primarily on widely-used languages such as English, Chinese, and Indo-European languages. Additionally, such schemes usually require extensive computational resources alongside a large amount of data, which is infeasible for less-widely used languages. We aim to address this research niche by building a language model that understands the linguistic phenomena in the target language which can be trained with low-resources. In this paper, we discuss Korean language modeling, specifically methods for language representation and pre-training methods. With our Korean-specific language representation, we are able to build more powerful models for Korean understanding, even with fewer resources. The paper proposes chunk-wise reconstruction of the Korean language based on a widely used transformer architecture and bidirectional language representation. We also introduce morphological features such as Part-of-Speech (PoS) into the language understanding by leveraging such information during the pre-training. Our experiment results prove that the proposed methods improve the model performance of the investigated Korean language understanding tasks.
URI
https://hdl.handle.net/10371/205566
Files in This Item:
There are no files associated with this item.
Appears in Collections:

Related Researcher

  • Graduate School of Convergence Science & Technology
  • Department of Intelligence and Information
Research Area Feature Selection and Extraction, Object Detection, Object Recognition

Altmetrics

Item View & Download Count

  • mendeley

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.

Share