Effective Training Methods for Autoregressive Text Generation

김양훈

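Effective Training Methods for Autoregressive Text Generation (Korean title: 자기회귀모델 기반 텍스트 생성을 위한 효과적인 학습 방법에 관한 연구, "A Study on Effective Training Methods for Autoregressive Model-Based Text Generation")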

Authors: 김양훈

Advisor: 정교민

Issue Date: 2021

Publisher: Seoul National University Graduate School (서울대학교 대학원)

Keywords: Deep Neural Networks; Text Generation; Question Generation

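Description: Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, August 2021. 김효석.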

Abstract: The rise of deep neural networks has driven tremendous advances in natural language processing research. Natural language generation, a subfield of natural language processing, is indispensable for building human-like artificial intelligence because it is responsible for conveying the decisions of machines in natural language. Neural network-based text generation techniques, which account for most state-of-the-art results, generally adopt autoregressive methods because these correspond to the word-by-word nature of human language production. In this dissertation, we investigate two different ways to train autoregressive text generation models based on deep neural networks. We first focus on token-level training for question generation, which aims to generate a question related to a given input passage. The proposed Answer-Separated Seq2Seq effectively mitigates a problem of previous question generation models: a significant proportion of the generated questions include words from the target answer. Although autoregressive methods are primarily trained with maximum likelihood estimation, they suffer from several problems, such as exposure bias. As a remedy, we propose a sequence-level GAN-based approach to text generation that promotes collaborative training in both continuous and discrete representations of text. Finally, to bring together the results of the research described above, we propose a novel way of training a sequence-level question generation model that adopts a pre-trained language model, one of the most significant breakthroughs in natural language processing, along with Proximal Policy Optimization.
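Abstract (translated from the Korean version): Natural language processing research has advanced dramatically with the introduction of deep neural networks. Natural language generation, a branch of natural language processing, conveys decisions made by machines in a form that people can understand, and is therefore an indispensable element in building artificial intelligence systems that imitate humans. Neural network-based text generation tasks generally rely on autoregressive methods because these resemble the way humans produce language. This dissertation proposes two training techniques for neural network-based autoregressive text generation models. The first introduces a token-level training method for question generation models. The proposed answer-separated sequence-to-sequence model effectively resolves the problem that questions produced by existing question generation models contain content from the answer. Autoregressive methods, which are trained mainly through maximum likelihood estimation, suffer from problems such as exposure bias. To address these problems, the dissertation proposes a sequence-level text generation technique based on generative adversarial networks that trains complementarily on both continuous-space and discrete-space representations of text. Finally, it combines the preceding methods to propose a sequence-level question generation technique, which uses a pre-trained language model, one of the latest natural language processing methods, together with Proximal Policy Optimization.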

Language: eng

URI: https://hdl.handle.net/10371/177640

https://dcollection.snu.ac.kr/common/orgView/000000166679

Files in This Item:

000000166679.pdf 4.97 MB

Appears in Collections:

College of Medicine/School of Medicine (의과대학/대학원)
- Dept. of Biomedical Sciences (대학원 의과학과)
  - Theses (Ph.D. / Sc.D._의과학과)
