Probing the Linguistic Knowledge of BERT based on the Layer-wise Investigation with Affinity Prober

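Abstract: Since the introduction of the Transformer (Vaswani et al., 2017), a variety of pre-trained language models built on the self-attention mechanism have been proposed. These models have generally achieved high performance on a wide range of natural language processing tasks through fine-tuning. In linguistics, the linguistic knowledge internalized by language models is being actively studied using approaches from theoretical and experimental linguistics, including syntax, semantics, and language acquisition. This thesis aims to extend the scope of the Affinity Prober, the linguistic-knowledge probing method proposed by Jang et al. (2022). To this end, we propose the ADTRAS algorithm (An Algorithm for Decrypting Token Relationships within Attention Scores), which interprets relationships between tokens while preserving the attention score values of the self-attention mechanism. In the first experiment, we use the ADTRAS algorithm to analyze the layer-wise patterns of BERT models trained separately on six GLUE benchmark tasks that require syntactic and semantic abilities. This analysis captures meaningful changes in token relationships within BERT and, based on the changes in BERT's attention, provides empirical evidence that BERT learns part-of-speech information on its own by leveraging lexical categories. We also generalize clear linguistic characteristics of BERT's layers with respect to lexical categories. In the second experiment, we use the Affinity Prober to analyze how BERT processes minimal-pair sentences for syntactic phenomena. The goal of this experiment is to examine how the 15 syntactic phenomena used are processed within BERT and to analyze the resulting layer-wise patterns. Four patterns were observed in total, and we argue that each pattern groups together similar linguistic phenomena. The first pattern consists mainly of phenomena related to Passive and Ellipsis N-bar, the second of Island Effects, the third of phenomena involving syntactic constraints on movement, and the fourth of phenomena involving verb predicate types and argument structure. That these layer-wise patterns are consistent with the results of the ADTRAS algorithm supports the findings of this experiment. In summary, this thesis proposes the ADTRAS algorithm and extends and applies the Affinity Prober proposed by Jang et al. (2022). In doing so, it extracts layer-wise patterns in BERT for syntactic phenomena and seeks to explain them.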
This paper investigates the linguistic knowledge embedded within BERT, a pre-trained language model based on the Transformer architecture. We reinforce and expand upon the methodology proposed by Jang et al. (2022) by introducing the ADTRAS algorithm (An Algorithm for Decrypting Token Relationships within Attention Scores), which decrypts token relationships within BERT's attention scores to analyze patterns at each layer. Our experiments using the ADTRAS algorithm demonstrate that BERT autonomously learns part-of-speech information by leveraging lexical categories. We also provide insights into the general tendencies of BERT's layers when processing content words and function words. Additionally, we introduce the Classification of Sentence Sequencing (CSS) as a fine-tuning strategy that enables indirect learning from minimal pairs, and we use the Affinity Prober to examine syntactic phenomena in the BLiMP dataset. By tracing patterns and clustering similar phenomena, we improve our understanding of how BERT interprets linguistic structures. Furthermore, we establish in detail the attributes of BERT's layers related to lexical categories by connecting the general layer tendencies generalized by the ADTRAS algorithm with the results obtained through the Affinity Prober. Our study makes several contributions. First, we introduce the ADTRAS algorithm, which enables a comprehensive analysis of BERT's linguistic knowledge. Second, we provide experimental evidence that BERT learns part-of-speech information. Third, we offer insights into the tendencies observed in different layers of BERT. Fourth, we propose the CSS fine-tuning strategy, which allows for indirect learning from minimal pairs. Fifth, we cluster syntactic phenomena using the Affinity Prober. Finally, we uncover BERT's general attention tendency towards lexical categories.

Language: eng

URI: https://hdl.handle.net/10371/197229

https://dcollection.snu.ac.kr/common/orgView/000000179160

Files in This Item:

000000179160.pdf 7.58 MB

Appears in Collections:

College of Humanities (인문대학)
- Linguistics (언어학과)
  - Theses (Master's Degree_언어학과)
