A Foundational Study for Evaluating and Giving Feedback on Non-native Korean Fluency

Abstract: This study lays the groundwork for automatically evaluating
utterance fluency, regarded as a core criterion in speaking
assessment, in the context of non-native Korean.
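From the many fluency-related features studied so far for L2 English,
we selected for analysis a subset that prior research has identified
as key. To examine whether these features are also valid for
assessing non-native Korean fluency, we propose a method that
automatically extracts the value of each feature for every utterance.
In particular, we investigate how much of the fluency scores given by
human raters is explained by combinations of the automatically
extracted feature values, and thereby examine which features can be
used in the automatic assessment of non-native Korean fluency and how
important each one is. This explores the possibility of providing
feedback alongside automatic assessment of non-native Korean fluency.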
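We verified the reliability of the automatic fluency feature
extraction method proposed in this thesis by comparing its output
values with the results of manual acoustic analysis. We judge that
this approach can give very useful fluency feedback to foreign
learners of Korean.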
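This study analyzed educational non-native Korean speech data
downloadable from AIHUB and automatically extracted values for the
features most important to fluency assessment: speech rate,
articulation rate, phonation-time ratio, mean length of runs, mean
number of silent pauses per minute, and mean length of silent pauses
per minute. Paired t-tests and Pearson correlation analysis comparing
these values with manual acoustic analysis demonstrated the
reliability of the automated procedure. In addition, two stepwise
multiple linear regression analyses were run to see which fluency
features serve as meaningful indicators for non-native Korean fluency
assessment, and to what extent. The first model, which used only
fluency measures to explain the raters' fluency scores, reached an
explanatory power of only R² = .466, whereas the second model, which
added the pronunciation score as an independent variable because the
data were paragraph-reading utterances, rose to R² = .686. We further
examined the features selected by the stepwise regression models and
their influence on each model. In the first model, speech rate
(Beta = 0.759) and mean length of silent pauses per minute
(Beta = 0.232), in that order, proved to be significant predictors of
fluency scores, a result similar to prior studies showing that speech
rate and mean pause length correlate highly with human auditory
perception of fluency. In the second model, which included the
pronunciation score as an independent variable in view of the
paragraph-reading task, the pronunciation score (Beta = 0.565),
speech rate (Beta = 0.408), and mean length of silent pauses per
minute (Beta = 0.235) influenced the regression model in that order.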
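
The six fluency measures named above can be illustrated with a minimal
sketch. The abstract does not give the thesis's exact formulas or pause
threshold, so the definitions below follow common usage in the L2
fluency literature and should be read as assumptions: syllables per
second for the two rates, a hypothetical 0.25 s minimum silent-pause
duration, and a mean pause length that is not normalized per minute.
The `Segment` type and `fluency_measures` function are illustrative
names, not the author's implementation.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float    # segment onset in seconds
    end: float      # segment offset in seconds
    syllables: int  # syllable count; 0 marks a silent stretch


def fluency_measures(segments, min_pause=0.25):
    """Compute six fluency measures from time-aligned segments.

    min_pause is an assumed threshold (seconds): silent stretches
    shorter than this are counted as part of the surrounding run.
    """
    total_time = sum(s.end - s.start for s in segments)
    if total_time <= 0:
        raise ValueError("segments must cover a positive duration")

    pauses = []       # durations of silent pauses
    runs = []         # syllable counts of runs between pauses
    current_run = 0
    for s in segments:
        duration = s.end - s.start
        if s.syllables == 0 and duration >= min_pause:
            pauses.append(duration)
            if current_run:
                runs.append(current_run)
                current_run = 0
        else:
            current_run += s.syllables
    if current_run:
        runs.append(current_run)

    syllables = sum(runs)
    pause_time = sum(pauses)
    phonation_time = total_time - pause_time
    minutes = total_time / 60.0
    return {
        # syllables per second, pauses included
        "speech_rate": syllables / total_time,
        # syllables per second, pauses excluded
        "articulation_rate": syllables / phonation_time if phonation_time else 0.0,
        # share of the total time spent speaking
        "phonation_time_ratio": phonation_time / total_time,
        # mean number of syllables between silent pauses
        "mean_length_of_runs": syllables / len(runs) if runs else 0.0,
        "pauses_per_minute": len(pauses) / minutes,
        # mean pause duration in seconds
        "mean_pause_length": pause_time / len(pauses) if pauses else 0.0,
    }


if __name__ == "__main__":
    # Toy utterance, not thesis data: two 4 s runs and two 1 s pauses.
    demo = [
        Segment(0.0, 4.0, 12),
        Segment(4.0, 5.0, 0),
        Segment(5.0, 9.0, 12),
        Segment(9.0, 10.0, 0),
    ]
    for name, value in fluency_measures(demo).items():
        print(f"{name}: {value:.3f}")
```

In practice the segments would come from a forced aligner or a silence
detector. Keeping the measures as a plain dictionary makes it easy to
feed them into a regression as predictor columns.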

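Keywords : utterance fluency, non-native Korean, fluency features,
automatic fluency measurement, reliability review of automatic
evaluation, fluency feedback
Student Number : 2019-29853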
This study is a foundational investigation aimed at automating the
evaluation of speech fluency, a critical assessment criterion in
spoken language evaluation, for non-native learners of Korean, and
at providing feedback.
Among the various fluency-related features previously
studied for L2 English, we selected several that prior research
identified as key and took them as the subjects of analysis.
To determine whether these features are also applicable to the
evaluation of non-native Korean fluency, we propose a method for
automatically extracting the value of each feature from non-native
learners' Korean utterances. By examining how well combinations of
the automatically extracted feature values explain the fluency
scores assigned by human evaluators, we explore which features
matter, and how much, in the automatic evaluation of non-native
Korean fluency and in feedback on it, thereby laying the groundwork
for further research in this area.
We validated the reliability of the automatic fluency feature
extraction method proposed in this paper by comparing the
automatically extracted values with the results of manual acoustic
analysis. We judge that this method can provide useful fluency
feedback to learners of Korean as a foreign language.
In this study, we analyzed educational speech data of non-native
Korean learners, which is available from AIHUB. The most important
features in fluency assessment, namely speech rate, articulation
rate, phonation-time ratio, mean length of runs, mean number of
silent pauses per minute, and mean length of silent pauses per
minute, were automatically extracted. Using paired t-tests and
Pearson's correlation analysis, we verified the reliability of our
automated method by comparing it with manual acoustic analysis.
Moreover, we ran two stepwise multiple linear regression analyses to
examine which fluency features function as significant indicators in
the evaluation of non-native Korean fluency. The first model, which
used only fluency features to explain the fluency scores of human
evaluators, had an explanatory power of only R² = .466. Because the
data consisted of paragraph-reading speech, the second model added
the pronunciation evaluation score as an independent variable, and
its explanatory power rose to R² = .686.
Furthermore, we examined the features selected by the stepwise
regression models and their influence on each model. The first model
identified speech rate (Beta = 0.759) and mean length of silent
pauses per minute (Beta = 0.232), in that order, as significant
predictors of fluency scores. This is consistent with prior studies
showing that speech rate and mean pause length correlate highly with
human auditory perception of fluency. In the second model, which
included the pronunciation score as an independent variable, the
pronunciation score (Beta = 0.565), speech rate (Beta = 0.408), and
mean length of silent pauses per minute (Beta = 0.235) influenced
the regression model in that order.
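
The reliability check described above, comparing automatically
extracted values with manual measurements, can be sketched as follows.
This is a minimal illustration, not the procedure used in the thesis.
The abstract does not say which software was used. The function names
`paired_t` and `pearson_r` are hypothetical, and the numbers in the
demo are invented toy values.

```python
import math
from statistics import mean, stdev


def paired_t(auto, manual):
    """Return (t statistic, degrees of freedom) for a paired t-test.

    A t value near zero suggests no systematic difference between
    the automatic and the manual measurements.
    """
    if len(auto) != len(manual) or len(auto) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    diffs = [a - m for a, m in zip(auto, manual)]
    sd = stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        raise ValueError("differences have zero variance")
    n = len(diffs)
    return mean(diffs) / (sd / math.sqrt(n)), n - 1


def pearson_r(x, y):
    """Return the Pearson correlation coefficient of two samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample has zero variance")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Invented speech-rate values (syllables/s) for five utterances.
    auto_values = [3.1, 2.8, 3.5, 2.9, 3.3]
    manual_values = [3.0, 2.9, 3.4, 3.0, 3.2]
    t, df = paired_t(auto_values, manual_values)
    r = pearson_r(auto_values, manual_values)
    print(f"t({df}) = {t:.3f}, r = {r:.3f}")
```

The sketch stops at the test statistic. To obtain a p-value, the t
statistic would be compared against a t distribution with the returned
degrees of freedom, for example with `scipy.stats.ttest_rel`, which
reports both values. A non-significant t together with a high r is the
pattern that would support agreement between the two methods.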

Keywords : utterance fluency, non-native Korean fluency, fluency
features, automatic fluency measurement, reliability validation of
automated evaluation, fluency feedback
Student Number : 2019-29853

Language: kor

URI: https://hdl.handle.net/10371/215974

https://dcollection.snu.ac.kr/common/orgView/000000185488

Files in This Item:

000000185488.pdf 0.90 MB

Appears in Collections:

College of Humanities (인문대학)
- Linguistics (언어학과)
  - Theses (Master's Degree_언어학과)
