Shaking Attention Scores in Pretrained Transformers

김종원

Shaking Attention Scores in Pretrained Transformers: A Study on the Manipulation of Attention Scores in Transformers

Authors: 김종원

Advisor: 이재진

Issue Date: 2023

Publisher: Seoul National University Graduate School

Keywords: Transformer ; attention score ; natural language processing ; NLU ; agglutinative language

Description: Thesis (Master's) -- Seoul National University Graduate School : Graduate School of Data Science, Department of Data Science, 2023. 2. 이재진.

Abstract: Although Korean differs markedly from English, there have been few attempts to design a Transformer model that better fits Korean by reflecting these differences. Among the characteristics of Korean, we pay special attention to the role of postpositions. Thanks to postpositions, agglutinative languages allow freer word order than inflectional languages such as English. This study is based on the hypothesis that the current Transformer has difficulty learning postpositions sufficiently, even though they play a significant role in agglutinative languages such as Korean. In Korean, each postposition is paired with a substantive, so it seems reasonable to attend more to the corresponding substantive than to other tokens in the sentence. However, the current Transformer learning algorithm is limited in its ability to do so. Accordingly, we show that performance on natural language understanding (NLU) tasks can be improved by deliberately changing the attention scores between postpositions and substantives. We also hope that this study will stimulate research on new learning methods that reflect the characteristics of Korean.
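Korean abstract (translated): Korean has characteristics clearly different from those of English, yet there have been few attempts to reflect them in the Transformer and find a new model better suited to Korean. This study focuses in particular on the role of postpositions among the characteristics of Korean. Reflecting the fact that Korean is an agglutinative language in which, thanks to postpositions, word order within a sentence is freer than in inflectional languages such as English, we propose a change to the way the Transformer computes attention scores. The study rests on the hypothesis that postpositions, which play a very important role in agglutinative languages such as Korean, are difficult to learn sufficiently in the current Transformer. In Korean, a postposition is paired with its substantive, so attending more to that substantive than to other tokens in the sentence seems reasonable, but the current Transformer training method has many limitations in this respect. We therefore show that the performance of NLU (Natural Language Understanding) tasks can be improved by artificially changing the attention scores between postpositions and substantives. We also hope that this work will stimulate research on new learning methods that reflect the characteristics of Korean.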
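
The abstract does not spell out how the attention scores are changed, so the following is only a minimal sketch of one possible reading: adding a constant bias to the pre-softmax score from a postposition (as query) to its paired substantive (as key). The function names, the `pairs` argument, the `bias` value, the toy vectors, and the direction of the boost are all assumptions made for illustration, not the thesis's actual method.

```python
import math


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights(queries, keys, pairs=(), bias=0.0):
    """Scaled dot-product attention weights with an optional additive bias.

    pairs: hypothetical (postposition_index, substantive_index) tuples;
           the score at [postposition, substantive] receives `bias`
           before the softmax.
    bias:  hypothetical constant; 0.0 reproduces ordinary attention.
    """
    d = len(queries[0])
    boosted = set(pairs)
    weights = []
    for i, q in enumerate(queries):
        row = []
        for j, k in enumerate(keys):
            score = sum(a * b for a, b in zip(q, k)) / math.sqrt(d)
            if (i, j) in boosted:
                score += bias  # assumed form of the "shaking"
            row.append(score)
        weights.append(softmax(row))
    return weights


# Toy three-token sentence: index 0 = substantive, 1 = postposition, 2 = verb.
queries = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pairs = [(1, 0)]  # the postposition at index 1 is paired with the substantive at index 0

base = attention_weights(queries, keys)
shaken = attention_weights(queries, keys, pairs, bias=2.0)
```

With a positive bias, the postposition's attention weight on its substantive rises and its weights on the other tokens shrink, while each row still sums to one. Rows for tokens that appear in no pair are unchanged.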

Language: eng

URI: https://hdl.handle.net/10371/193615

https://dcollection.snu.ac.kr/common/orgView/000000174089

Files in This Item:

000000174089.pdf 0.70 MB

Appears in Collections:

Graduate School of Data Science
- Theses (Master's Degree, Department of Data Science)
