
Detailed Information

Contextualized Language Representations with Deep Neural Networks for Unsupervised Learning : 비지도 학습을 위한 딥 뉴럴 네트워크 기반 문맥화된 언어 표현 연구

DC Field / Value
dc.contributor.advisor: 정교민
dc.contributor.author: 신중보
dc.date.accessioned: 2021-11-30T02:19:21Z
dc.date.available: 2021-11-30T02:19:21Z
dc.date.issued: 2021-02
dc.identifier.other: 000000165301
dc.identifier.uri: https://hdl.handle.net/10371/175267
dc.identifier.uri: https://dcollection.snu.ac.kr/common/orgView/000000165301 (ko_KR)
dc.description: Thesis (Ph.D.) -- Graduate School, Seoul National University: Department of Electrical and Computer Engineering, College of Engineering, February 2021. Advisor: 정교민.
dc.description.abstract: In natural language processing, deep neural networks are powerful language learners because they can flexibly incorporate context information from raw text data. Language representations learned in an unsupervised manner on a large corpus give deep neural networks a basis for understanding human language better. Remarkable progress in natural language understanding has been made by pre-training contextualized language representations with language modeling, a representative unsupervised (or self-supervised) learning technique. In contextualized language representation learning, autoregressive language modeling and masked language modeling are the two major learning objectives, and state-of-the-art pre-training methods are based on these two tasks. This dissertation presents a novel language modeling task, bidirectional language autoencoding, that takes advantage of both previous learning objectives. The proposed objective enables a model to understand text in a deep, bidirectional way, as masked language modeling does, while also allowing contextualized language representations to be extracted without fine-tuning, as autoregressive language modeling does. To learn bidirectional language autoencoding, this dissertation introduces a novel network architecture for a deep bidirectional language model. The architecture allows the bidirectional language model to learn useful language representations rather than simply copying its input, and gives each word a contextualized representation. The main contribution of this dissertation is the verification that the proposed bidirectional language autoencoding can be a better approach than previous language modeling tasks when extracting contextualized language representations for natural language understanding tasks. Experimental results on N-best list re-ranking, semantic textual similarity, word sense disambiguation, and text classification demonstrate the advantages of the proposed unsupervised representation learning over previous language modeling objectives.
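Read schematically, and in notation assumed here rather than taken from the dissertation, the three objectives can be contrasted as follows: autoregressive language modeling (ALM) predicts each token from its left context, masked language modeling (MLM) predicts a masked subset of tokens from the rest of the sentence, and the proposed bidirectional language autoencoding (BLA) predicts every token from all of the other tokens,

    \mathcal{L}_{\mathrm{ALM}} = -\sum_{t=1}^{T} \log p(x_t \mid x_{<t}), \qquad
    \mathcal{L}_{\mathrm{MLM}} = -\sum_{t \in \mathcal{M}} \log p(x_t \mid x_{\setminus \mathcal{M}}), \qquad
    \mathcal{L}_{\mathrm{BLA}} = -\sum_{t=1}^{T} \log p(x_t \mid x_{\setminus t}),

so that, under this reading, BLA keeps MLM's two-sided conditioning while, like ALM, yielding a representation for every position in a single forward pass, without masking-and-repeating or fine-tuning.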
dc.description.abstract: Deep neural networks are used as powerful tools in natural language processing because they can flexibly handle contextual information in unstructured text data. Language representations learned in an unsupervised manner on a large corpus provide a source that lets deep neural networks make better use of contextual information. Natural language understanding has made remarkable progress beyond pre-trained (non-contextual) word embeddings by learning contextualized language representations with language modeling, a representative self-supervised learning technique. Language modeling can be broadly divided into autoregressive language modeling and masked language modeling, and state-of-the-art pre-training methods are likewise based on these two. This dissertation presents bidirectional language autoencoding, a new language modeling task that takes the advantages of both. The proposed bidirectional language autoencoding enables deep bidirectional language understanding like masked language modeling and, at the same time, allows contextualized language representations to be extracted and used without fine-tuning, like autoregressive language modeling. This dissertation designs a new neural network architecture that makes learning bidirectional language autoencoding possible. The proposed bidirectional language model learns useful language representations rather than simply copying its input, and gives each word a contextualized representation so that no information is lost. The main contribution of this dissertation is proposing the new bidirectional language model and verifying that it can be a better option than previous approaches when contextualized language representations are extracted and used for natural language understanding problems. Experimental results are presented for N-best list re-ranking, semantic textual similarity, word sense disambiguation, and text classification, and they show the advantages of the proposed method over previous ones.
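The abstract's point that the model must learn useful representations "rather than simply copying" hinges on keeping each token's own input out of its own output path. Below is a minimal NumPy sketch of one way such self-exclusion can be realized in an attention layer; the function name, the position-only queries, and all shapes are illustrative assumptions, not the dissertation's actual implementation.

    import numpy as np

    def self_exclusion_attention(x, pos, W_q, W_k, W_v):
        """Single attention layer in which each position attends to every
        other position but never to itself (a hypothetical sketch)."""
        T, d = x.shape
        # Queries are built from position embeddings only, so a token's own
        # content cannot reach its output through the query path.
        q = pos @ W_q                            # (T, d)
        k = x @ W_k                              # (T, d)
        v = x @ W_v                              # (T, d)
        scores = (q @ k.T) / np.sqrt(d)          # (T, T) attention logits
        # Diagonal masking: position t may not attend to input token t,
        # so its representation must be rebuilt from the surrounding context.
        np.fill_diagonal(scores, -1e9)
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ v                       # one context vector per token

    # Toy usage with random embeddings.
    rng = np.random.default_rng(0)
    T, d = 5, 8
    x = rng.normal(size=(T, d))                  # token embeddings
    pos = rng.normal(size=(T, d))                # position embeddings
    W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
    h = self_exclusion_attention(x, pos, W_q, W_k, W_v)
    print(h.shape)                               # (5, 8)

Under the assumptions of this sketch, stacking such layers and training each position's output to reproduce the original token yields a reconstruction-style objective in which copying is impossible, while every position still ends up with a context-dependent vector that can be read off directly.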
dc.description.tableofcontents:
Abstract i
Contents iii
List of Tables vii
List of Figures ix

1 Introduction 1
1.1 Overview 1
1.2 Contributions and Outline of This Dissertation 6

2 Background: Language Representation Models 8
2.1 Non-contextualized Word Representations: Word Embeddings 9
2.2 ALM-based Language Representation Models 12
2.2.1 ELMo 13
2.2.2 GPT 14
2.3 MLM-based Language Representation Models 16
2.3.1 BERT 16
2.3.2 Other Language Representation Models 19
2.4 Base Language Model Architecture: SAN 21

3 Masked Language Modeling for Sentence Scoring 26
3.1 Accurate Bidirectional LMs 26
3.1.1 Overview 26
3.1.2 Contributions 27
3.2 Related Works 29
3.2.1 Bidirectional LMs in NLP 29
3.2.2 Bidirectional LMs for ASR 29
3.3 Methodology 30
3.3.1 Architecture of SAN-based LMs 30
3.3.2 Sentence Scoring with SANLMs 32
3.3.3 Re-ranking the N-best List with SANLMs 36
3.4 Experiments 37
3.4.1 Acoustic Model Setups 37
3.4.2 Language Model Setups 38
3.4.3 Results: Re-ranking the N-best List 39
3.4.4 Analysis: Misrecognized Position 41
3.5 Summary of MLM for Sentence Scoring 43

4 Bidirectional Language Autoencoding for Sentence Scoring 44
4.1 Fast and Accurate Bidirectional LMs 44
4.1.1 Overview 44
4.1.2 Contributions 45
4.2 Related Works 47
4.2.1 Bidirectional LMs for Unsupervised Tasks 47
4.2.2 Consideration of Inference Speed 47
4.3 Methodology 48
4.3.1 Baselines: ALM for UniLM and MLM for BiLM 48
4.3.2 BLA: New Language Modeling Objective 49
4.3.3 T-TA: New Deep Bidirectional Language Model 51
4.3.4 Verification of the T-TA Architecture 55
4.3.5 Comparison of T-TA with BERT 56
4.4 Experiments 57
4.4.1 Language Model Setups 57
4.4.2 Analysis: Runtime Comparison 59
4.4.3 Settings: Re-ranking the N-best List 61
4.4.4 Results: Re-ranking the N-best List 65
4.4.5 Analysis: Re-ranking and Language Models 68
4.5 Summary of BLA for Sentence Scoring 72

5 Bidirectional Language Autoencoding for Feature Extraction 73
5.1 Extracting Contextualized Language Representations 73
5.1.1 Overview 73
5.1.2 Contributions 74
5.2 Related Works 76
5.2.1 Contextualization in Language Representations 76
5.2.2 Word-level vs. Sentence-level Representations 76
5.3 Experiments on Unsupervised Learning Tasks 78
5.3.1 Language Model Setups 78
5.3.2 Settings: Unsupervised STS 78
5.3.3 Results: Unsupervised STS 80
5.3.4 Settings: Unsupervised WiC 83
5.3.5 Results: Unsupervised WiC 84
5.4 Experiments on Supervised Learning Tasks 86
5.4.1 Language Model Setups 86
5.4.2 Settings: Text Classification Tasks 87
5.4.3 Results: Feature Extraction 88
5.4.4 Results: Fine-tuning Approach 90
5.5 Summary of BLA for Feature Extraction 92

6 Conclusions and Future Works 93
6.1 Future Works 94

Abstract (In Korean) 109
Acknowledgement 111
dc.format.extent: x, 111
dc.language.iso: kor
dc.publisher: Graduate School, Seoul National University
dc.subject: deep neural networks
dc.subject: language modeling
dc.subject: unsupervised learning
dc.subject: contextualized language representations
dc.subject: 딥 뉴럴 네트워크
dc.subject: 언어 모델링
dc.subject: 비지도 학습
dc.subject: 문맥화된 언어 표현
dc.subject.ddc: 621.3
dc.title: Contextualized Language Representations with Deep Neural Networks for Unsupervised Learning
dc.title.alternative: 비지도 학습을 위한 딥 뉴럴 네트워크 기반 문맥화된 언어 표현 연구
dc.type: Thesis
dc.type: Dissertation
dc.contributor.AlternativeAuthor: SHIN JOONGBO
dc.contributor.department: Department of Electrical and Computer Engineering, College of Engineering
dc.description.degree: Doctor
dc.date.awarded: 2021-02
dc.identifier.uci: I804:11032-000000165301
dc.identifier.holdings: 000000000044▲000000000050▲000000165301▲