A Study on the Understanding of the Number Concept of the AI Model for Math Word Problem Solving (수학 문제 풀이에 대한 인공지능 모델의 수 개념 이해도에 대한 고찰)

안지수

Seoul National University Library

수학 문제 풀이에 대한 인공지능 모델의 수 개념 이해도에 대한 고찰 : A Study on the Understanding of the Number Concept of the AI Model for Math Word Problem Solving

DC Field	Value	Language
dc.contributor.advisor	권가진	-
dc.contributor.author	안지수	-
dc.date.accessioned	2023-11-20T04:41:36Z	-
dc.date.available	2023-11-20T04:41:36Z	-
dc.date.issued	2023	-
dc.identifier.other	000000178349	-
dc.identifier.uri	https://hdl.handle.net/10371/197065	-
dc.identifier.uri	https://dcollection.snu.ac.kr/common/orgView/000000178349	ko_KR
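dc.description	Thesis (Master's) -- Seoul National University Graduate School: Graduate School of Convergence Science and Technology, Dept. of Intelligence and Information, 2023. 8. 권가진.	-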
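dc.description.abstract	Automatic solving of math word problems is an intriguing field that has been studied continuously since the 1960s (Wilks, 1976). As AI advances, attempts to use AI to solve math word problems are increasing. However, it has recently been argued (Patel et al., 2021) that math word problem solvers do not understand a problem and solve it through reasoning, but instead derive answers by suitably combining the numbers that appear in it, so whether these models actually understand math word problems has become unclear. Because understanding the numbers that appear in a problem is a prerequisite for understanding the problem, this thesis conducts two studies that help math word problem solvers understand numbers during problem solving. Study 1 proposes an explicit feature extraction method. Previous work using BERT-family pre-trained language models made only limited use of number information when solving math problems, which made it hard to determine the magnitude relations between the numbers in a problem. The proposed method makes number tokens available for problem solving. Applied to the deductive reasoner model, which achieves the best performance on the SVAMP dataset among natural language understanding models, it improved performance by up to 2.8%. This result confirms the possibility that using the number tokens in a problem helps natural language understanding models determine magnitude relations and can thereby raise accuracy. Study 2 shows experimentally that GPT-3.5-turbo, an implementation of the GPT family of pre-trained language models, does not fully understand the concept of digit position when comparing the magnitudes of numbers made up of multiple tokens, and proposes a strategy to compensate. The strategy raised accuracy by 3.1% (in CoT) and 2.06% (in PoT) over the best performance of existing prompting strategies using GPT-3.5-turbo. This result confirms the possibility that supplying the concept of digit position when natural language generation models solve math problems helps them determine the magnitude relations between number tokens and can raise accuracy.	-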
dc.description.abstract	Math word problem solving is an intriguing field that has been studied continuously since the 1960s. As AI advances, attempts to use AI to solve math word problems have been increasing. However, recent work has raised the concern that math word problem solvers do not understand a problem and solve it through reasoning, but instead derive answers by combining the numbers that appear in it, which leaves it unclear whether these models understand the problems at all. Understanding the numbers that appear in a problem is a prerequisite for understanding the problem, so this thesis conducts two studies that help models understand numbers during problem solving. Study 1 proposes an explicit feature extraction method to address the limited use of number tokens in pre-trained BERT-series language models, which makes it hard for them to judge the relative magnitude of the numbers in a problem. Applied to the deductive reasoner, the best-performing natural language understanding model on the SVAMP dataset, the method improved performance by up to 2.8%. These results confirm the potential of using the number tokens in a problem to help a model judge magnitude and thereby increase its accuracy. Study 2 demonstrates that gpt3.5-turbo, an implementation of the pre-trained GPT-series language model, fails to fully comprehend the concept of digit position when comparing the magnitudes of numbers composed of multiple tokens, and proposes a strategy to compensate. The strategy increased accuracy by 3.1% (in CoT) and 2.06% (in PoT) over the best performance of previous prompt strategies using gpt3.5-turbo. These results confirm the potential that incorporating the concept of digit position when natural language generation models solve math word problems helps them judge the magnitude of number tokens and can increase accuracy.	-
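dc.description.tableofcontents	Chapter 1 Introduction 1; 1.1 Research Background 1; 1.2 Research Content 3; 1.2.1 Study 1: Number Understanding in Natural Language Understanding Models 3; 1.2.2 Study 2: Number Understanding in Natural Language Generation Models 4; Chapter 2 Related Work 6; 2.1 Pre-trained Language Models Based on the Transformer Architecture 6; 2.1.1 Transfer Learning 6; 2.1.2 Transformer Architecture 7; 2.1.3 Natural Language Understanding Models 9; 2.1.4 Natural Language Generation Models 10; 2.2 Automatic Math Word Problem Solving 11; 2.2.1 Math Word Problem Solving with Natural Language Understanding Models 11; 2.2.2 Math Word Problem Solving with Natural Language Generation Models 14; Chapter 3 Research Methods 19; 3.1 Experimental Environment 19; 3.1.1 Datasets and Implementation Details 20; 3.2 Study 1: Number Understanding in Natural Language Understanding Models 21; 3.2.1 Proposal and Use of the Explicit Feature Extraction Method 21; 3.2.2 Elastic Transformer Architecture 24; 3.2.3 Applying the Explicit Feature Extraction Method to the Deductive Reasoner Model 25; 3.3 Study 2: Number Understanding in Natural Language Generation Models 26; 3.3.1 Sorting Experiment on Numbers Composed of Multiple Tokens 26; 3.3.2 Experiment Replacing Numerals with English Number Words 28; 3.3.3 Experiment Adding the Decimal Number Concept 29; Chapter 4 Experimental Results and Analysis 31; 4.1 Study 1: Number Understanding in Natural Language Understanding Models 31; 4.1.1 Explicit Feature Extraction Experiment 31; 4.2 Study 2: Number Understanding in Natural Language Generation Models 32; 4.2.1 Sorting Experiment on Numbers Composed of Multiple Tokens 32; 4.2.2 Experiment Replacing Numerals with English Number Words and Experiment Adding the Decimal Number Concept 33; Chapter 5 Conclusion and Limitations 35; ABSTRACT 44	-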
dc.format.extent	45	-
dc.language.iso	kor	-
dc.publisher	Seoul National University Graduate School	-
dc.subject	Number Understanding	-
dc.subject	BERT	-
dc.subject	GPT	-
dc.subject	ChatGPT	-
dc.subject	gpt3.5-turbo	-
dc.subject	Prompt Engineering	-
dc.subject	Chain of thought	-
dc.subject.ddc	006.3	-
dc.title	수학 문제 풀이에 대한 인공지능 모델의 수 개념 이해도에 대한 고찰	-
dc.title.alternative	A Study on the Understanding of the Number Concept of the AI Model for Math Word Problem Solving	-
dc.type	Thesis	-
dc.type	Dissertation	-
dc.contributor.AlternativeAuthor	An, Ji-su	-
dc.contributor.department	Graduate School of Convergence Science and Technology, Dept. of Intelligence and Information	-
dc.description.degree	Master's	-
dc.date.awarded	2023-08	-
dc.identifier.uci	I804:11032-000000178349	-
dc.identifier.holdings	000000000050▲000000000058▲000000178349▲	-
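
The abstract's Study 2 proposes supplying the concept of digit position to a generation model, but this record does not give the prompt format. The following is only a minimal sketch of one way such an annotation could be built before a problem is placed in a prompt. The function names, the place-name list, and the parenthesised output format are all hypothetical and are not taken from the thesis.

```python
import re

# Hypothetical place names, lowest position first (an assumption, not from the thesis).
PLACE_NAMES = [
    "ones", "tens", "hundreds", "thousands",
    "ten-thousands", "hundred-thousands", "millions",
]

# Matches standalone integers; digits that belong to a decimal such as 3.5 are skipped.
INTEGER_PATTERN = re.compile(r"(?<![\d.])\d+(?!\.?\d)")


def place_value_description(number_text: str) -> str:
    """Spell out each digit of a non-negative integer string with its position."""
    digits = number_text.lstrip("0") or "0"
    if len(digits) > len(PLACE_NAMES):
        raise ValueError(f"number too long for this sketch: {number_text}")
    parts = []
    for offset, digit in enumerate(digits):
        # The leftmost digit has the highest position.
        place = PLACE_NAMES[len(digits) - 1 - offset]
        parts.append(f"{digit} {place}")
    return ", ".join(parts)


def annotate_numbers(problem: str) -> str:
    """Append a digit-position note after every standalone integer in the text."""
    def add_note(match: re.Match) -> str:
        number_text = match.group(0)
        return f"{number_text} ({place_value_description(number_text)})"

    return INTEGER_PATTERN.sub(add_note, problem)


if __name__ == "__main__":
    print(annotate_numbers("Tom has 305 apples and 42 pears."))
```

The annotated text could then be passed to whichever prompting strategy is in use (for example CoT or PoT, as named in the abstract). Whether this particular format helps is not something this record establishes.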

Appears in Collections:

Graduate School of Convergence Science and Technology (융합과학기술대학원)
- Dept. of Intelligence and Information (지능정보융합학과)
  - Theses (Master's Degree_Dept. of Intelligence and Information)

Files in This Item:

000000178349.pdf 9.16 MB
