Contextualized Language Representations with Deep Neural Networks for Unsupervised Learning

Abstract: In natural language processing, deep neural networks are powerful language learners because they can flexibly incorporate context information from raw text data. Language representations learned in an unsupervised manner on a large corpus give deep neural networks a source for understanding human language better. Natural language understanding has made remarkable progress by pre-training contextualized language representations with language modeling, a representative unsupervised (or self-supervised) learning technique. In contextualized language representation learning, autoregressive language modeling and masked language modeling are the two major learning objectives, and state-of-the-art pre-training methods are based on these two tasks. This dissertation presents a novel language modeling task called bidirectional language autoencoding that takes advantage of both of these learning objectives. The proposed objective enables a model to understand text in a deep and bidirectional way, like masked language modeling, and at the same time to extract contextualized language representations without fine-tuning, like autoregressive language modeling. To learn bidirectional language autoencoding, this dissertation introduces a novel network architecture for a deep bidirectional language model. The architecture lets the bidirectional language model learn useful language representations rather than simply copying its input, and gives each word a contextualized representation. The main contribution of this dissertation is the verification that the proposed bidirectional language autoencoding can be a better approach than the previous language modeling tasks when extracting contextualized language representations for natural language understanding tasks. Experimental results are presented on N-best list re-ranking, semantic textual similarity, word sense disambiguation, and text classification, demonstrating the advantages of the proposed unsupervised representation learning over previous language modeling approaches.
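Abstract (translated from Korean): Deep neural networks are used as powerful tools in natural language processing because they can flexibly handle context-dependent information in unstructured text data. Language representations learned in an unsupervised manner on a large corpus provide a source that helps deep neural networks make better use of context information. Natural language understanding has made remarkable progress beyond pre-trained (non-contextual) word embeddings by learning contextualized language representations with language modeling, a representative self-supervised learning technique. Language modeling can be broadly divided into autoregressive language modeling and masked language modeling, and state-of-the-art pre-training methods are also based on these two. This dissertation presents bidirectional language autoencoding, a new language modeling task that takes the advantages of both. Bidirectional language autoencoding enables deep bidirectional language understanding, like masked language modeling, and at the same time allows contextualized language representations to be extracted and used without fine-tuning, like autoregressive language modeling. This dissertation designs a new neural network architecture that makes it possible to learn bidirectional language autoencoding. The proposed bidirectional language model learns useful language representations rather than simply copying, and gives each word a contextualized representation so that no information is lost. The main contribution of this dissertation is proposing a new bidirectional language model and verifying that it can be a better approach than existing ones when extracting and using contextualized language representations for natural language understanding problems. Experimental results are presented on N-best list re-ranking, semantic textual similarity, word sense disambiguation, and text classification, and show the advantages of the proposed method over previous methods.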
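
The contrast among the three learning objectives named in the abstract can be illustrated with a minimal sketch. This is not the dissertation's implementation: the abstract does not say how copying is prevented, so the third function assumes one plausible reading, in which every token is predicted from all other tokens but never from itself. All function names are hypothetical.

```python
def autoregressive_context(n):
    """Autoregressive LM: token i is predicted from the tokens before it.

    visible[i][j] is True when token j may be used to predict token i.
    """
    return [[j < i for j in range(n)] for i in range(n)]


def masked_lm_context(n, masked):
    """Masked LM: only the masked positions are predicted.

    Each masked position sees every unmasked token on both sides.
    Returns a dict mapping each predicted position to its visibility row.
    """
    masked = set(masked)
    return {i: [j not in masked for j in range(n)] for i in sorted(masked)}


def autoencoding_context(n):
    """Assumed reading of bidirectional language autoencoding.

    Every position is predicted, from all other positions but not itself,
    so the model cannot solve the task by copying its own input token.
    """
    return [[j != i for j in range(n)] for i in range(n)]


if __name__ == "__main__":
    for name, rows in [
        ("autoregressive", autoregressive_context(4)),
        ("autoencoding (assumed)", autoencoding_context(4)),
    ]:
        print(name)
        for row in rows:
            print(" ".join("1" if v else "0" for v in row))
    print("masked LM, position 2 masked:", masked_lm_context(4, [2]))
```

Under this reading, the autoencoding pattern yields one context-dependent prediction per word from a single input, which matches the abstract's claim that representations can be extracted without fine-tuning, while still using context from both directions.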

Language: kor

URI: https://hdl.handle.net/10371/175267

https://dcollection.snu.ac.kr/common/orgView/000000165301

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Electrical and Computer Engineering (전기·정보공학부)
  - Theses (Ph.D. / Sc.D._전기·정보공학부)
