Leveraging Linguistic Patterns for Neural Sequence Labeling in Natural Language Processing

Abstract: Sequence labeling, also known as sequence tagging, is a core technique in Natural Language Processing (NLP) and Spoken Language Understanding (SLU), where it is used to extract information from text. Well-known examples are part-of-speech (POS) tagging, chunking, named entity recognition (NER), and supersense tagging in NLP, and slot filling in SLU. Neural network based sequence labeling algorithms achieve state-of-the-art performance on these tasks. Most neural sequence labeling models expand their learning capacity by adding layers such as a character-level layer, by jointly training on NLP tasks that share common knowledge, or by augmenting the training data. We leverage linguistic patterns using these approaches.

In this dissertation, we introduce four ways to learn linguistic patterns that help decide which labels to assign. The first approach is tagging with a combination of character-, subword-, and word-level representations, in which we incorporate all three types of word-feature units to extract linguistic features. The second is joint learning with delexicalization, in which we jointly predict labels and the words near entities to exploit their mutual information. The third is joint learning with segment-level language modeling, in which we exploit segment information to leverage linguistic patterns within a segment. The last is data generation with Variational Auto-Encoders (VAEs), in which we generate data based on the original data distribution to expand the training dataset automatically.
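As a rough illustration of two notions used above, segments and delexicalization, the following sketch extracts labeled segments from a BIO-tagged utterance and replaces each one with a placeholder. The BIO scheme, the `<type>` placeholder format, and the function names `bio_segments` and `delexicalize` are assumptions chosen for this example; the abstract does not specify the dissertation's tagging scheme or delexicalization procedure.

```python
from typing import List, Tuple


def bio_segments(labels: List[str]) -> List[Tuple[int, int, str]]:
    """Return (start, end_exclusive, type) for each labeled segment in a BIO sequence."""
    segments = []
    start, seg_type = None, None
    # A trailing "O" sentinel closes a segment that runs to the end of the sequence.
    for i, label in enumerate(labels + ["O"]):
        continues = label.startswith("I-") and seg_type == label[2:]
        if start is not None and not continues:
            segments.append((start, i, seg_type))
            start, seg_type = None, None
        # Open a segment on "B-x", or on a stray "I-x" with no open segment.
        if label.startswith("B-") or (label.startswith("I-") and start is None):
            start, seg_type = i, label[2:]
    return segments


def delexicalize(tokens: List[str], labels: List[str]) -> List[str]:
    """Replace every labeled segment with a single <type> placeholder (assumed format)."""
    out: List[str] = []
    prev_end = 0
    for start, end, seg_type in bio_segments(labels):
        out.extend(tokens[prev_end:start])  # keep the words around the segment
        out.append(f"<{seg_type}>")
        prev_end = end
    out.extend(tokens[prev_end:])
    return out


tokens = ["flights", "to", "new", "york", "on", "monday"]
labels = ["O", "O", "B-city", "I-city", "O", "B-date"]
print(bio_segments(labels))
print(" ".join(delexicalize(tokens, labels)))
```

In this toy utterance the two-token span "new york" forms one segment, which is why segment-level modeling treats it as a single unit instead of two independent tokens.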

The main contributions are summarized as follows. First, we examine the distinct characteristics offered by different granularities of input representation. Second, we explore the possibility of improving sequence labeling performance efficiently by utilizing linguistic features. Third, we introduce a new model architecture that improves the labeling accuracy of segments, which are the actual units of labeling. Last, we propose a labeled utterance generation model that minimizes the human effort needed to augment original data that is limited and insufficient. Experimental results demonstrate the advantage of leveraging linguistic patterns for sequence labeling in NLP and SLU.

Sequence labeling algorithms are used to extract information by processing natural language sentences token by token. Representative examples of sequence labeling are slot filling, named entity recognition, part-of-speech tagging, chunking, and supersense tagging. Recently, neural network based sequence labeling algorithms have been proposed to solve these problems, along with training methods that use linguistic characteristics and context to improve a model's learning capacity. For example, the field has developed by adding character-level hidden layers or by proposing new objective functions, such as training jointly with other NLP tasks that share common knowledge. In particular, methods that jointly train a language model and sequence labeling have been proposed. However, when context is learned through a language model, words that appear infrequently are learned relatively poorly. In addition, because labeled data is costly to build, its quantity is sometimes insufficient for training deep learning models.

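This dissertation proposes methods that use linguistic patterns to address these problems. The first method uses characters, subwords, and words as learning units, analyzes the role of each unit, and proposes a model that uses their combination. The second method applies joint learning of sequence labeling and delexicalized sentence generation, focusing training on the context that can appear around labels. The third method trains the language model at the segment level, so that the language model is computed over segments, the actual units of labeling, yielding a context learning model that is also robust to phrase-level entities. The fourth is a label-conditioned text generation algorithm for data augmentation, which generates sentences similar to the existing data to add diversity to the training data and help deep learning models train sufficiently. We show experimentally that the proposed methods, which take linguistic characteristics and context into account, are effective for sequence labeling.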

Language: eng

URI: https://hdl.handle.net/10371/162014

http://dcollection.snu.ac.kr/common/orgView/000000157661

Files in This Item:

000000157661.pdf 4.80 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Computer Science and Engineering (컴퓨터공학부)
  - Theses (Ph.D. / Sc.D._컴퓨터공학부)
