Effects of Korean English Teachers' Perceived Criterion Importance on Scoring Behavior in L2 Writing Assessment

이영주

Effects of Korean English Teachers' Perceived Criterion Importance on Scoring Behavior in L2 Writing Assessment : 한국인 영어 교사의 채점 기준에 대한 중요성 인식이 영작문 채점 행동에 미치는 영향 분석

Authors: 이영주

Advisor: 소영순

Issue Date: 2023

Publisher: Seoul National University Graduate School

Keywords: rater effects ; criterion importance ; English writing assessment ; rater types ; rater perception ; performance assessment

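Description: Thesis (Master's) -- Seoul National University Graduate School : College of Education, Department of Foreign Language Education (English Major), 2023. 8. 소영순.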

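Abstract: With the introduction of Communicative Language Teaching, the importance of performance assessment, which evaluates learners' actual ability to use English, has been emphasized. What distinguishes performance assessment from conventional multiple-choice testing is the rating rubric used to score learner-constructed responses. In other words, how raters interpret and apply the rubric greatly affects the scores learners receive. Accordingly, many previous studies have examined the nature of the interaction between raters and rating criteria, with the aim of supporting more valid assessment by reducing raters' inconsistent application of the criteria. Existing research on rater effects has shown descriptively which rating criteria raters score severely or leniently. However, attempts to identify the underlying causes of rater effects from a cognitive perspective have been scarce. It is therefore necessary to examine how raters' perceptions of the rating criteria affect scoring severity or leniency.
This study therefore explores how the perceived importance of rating criteria in English writing assessment affects actual scoring. In doing so, it aims to examine raters' perceptions of the rating criteria and to help raters form an unbiased view of them.
Thirty Korean English teachers working at middle or high schools in Korea participated in the study. They completed a survey on how much importance they attach to five rating criteria (Content, Organization, Vocabulary, Language use, Mechanics) and scored 30 English compositions. Using the many-facet Rasch model and hierarchical cluster analysis, Cognitive Rater Types (CRTs) and Operational Rater Types (ORTs) were constructed from perceived criterion importance and criterion-related rater effects, respectively. The relationship between the two sets of rater types was then analyzed.
Five Cognitive Rater Types emerged from perceived criterion importance, and six Operational Rater Types emerged from criterion-related rater effects. Because each Operational Rater Type consisted of raters with differing perceptions of criterion importance, the two sets of types could not be compared directly. Therefore, for raters belonging to the same Operational Rater Type, the mean importance rating of each criterion was compared with the bias measure for that criterion; this comparison showed that the effect of perceived criterion importance on scoring behavior differed by criterion. Both severity and leniency were found for Content and Mechanics, but the way rater effects combined with perceived importance differed between these two criteria. For Content, severity appeared together with above-average criterion importance, and leniency appeared together with below-average criterion importance. For Mechanics the pattern was reversed: severity appeared together with below-average criterion importance, and leniency with above-average criterion importance. These contrasting patterns for Content and Mechanics were also observed in the data for individual raters.
Although the study is limited in that its participants were not trained in writing assessment, it is expected to help broaden research on rater effects by clarifying the relationship between perceived criterion importance and rater behavior.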
The advent of Communicative Language Teaching has placed an emphasis on performance-based assessments, which measure the ability to use a language. What distinguishes performance-based assessments from multiple-choice questions is the presence of a rating rubric. That is, how raters perceive and apply the rubric plays a significant role in the evaluative process. Accordingly, a large body of research has investigated the interaction between raters and rating criteria, aiming to enhance the validity of performance tests by reducing inconsistency in raters' scoring. Previous studies of rater effects have described the rating criteria on which raters displayed more severity or leniency. However, few attempts have been made to understand the reasons behind rater idiosyncrasy from a cognitive perspective. Hence, it is worthwhile to investigate how rater perception of the rating criteria can affect scoring behavior.
The purpose of the present study is to examine how perceived criterion importance can influence scoring behavior. Exploring the relation between rater perception of the rating criteria and scoring profiles will contribute to a better understanding of rater cognition, which in turn can help raters develop a more balanced view of the rating criteria.
For this study, thirty Korean English teachers working at middle and high schools completed a survey in which they indicated the importance of five rating criteria: Content, Organization, Vocabulary, Language use, and Mechanics. Participants also rated thirty written compositions chosen from YELC (Yonsei English Learners Corpus). Using Many-facet Rasch measurement and Hierarchical Clustering, two sets of rater types were formed: Cognitive Rater Types, which were based on perceived criterion importance, and Operational Rater Types, which were derived from criterion-related biases. These two sets of rater types were compared to analyze the relationship between rater perception and rating behavior.
The analysis yielded five Cognitive Rater Types (CRTs) and six Operational Rater Types (ORTs). As all ORTs were composed of raters from different CRTs, it was not possible to investigate direct relationships between CRTs and ORTs. Therefore, for raters who belonged to the same ORT but came from different CRTs, the mean bias measure was analyzed in relation to the mean criterion importance rating. This analysis showed that the effect of criterion importance on scoring behavior varied depending on the rating criterion involved. In Content and Mechanics, both severity and leniency biases were observed, but how the biases combined with criterion importance differed between these two criteria. In Content, severity bias was associated with above-average criterion importance, whereas leniency bias was associated with below-average criterion importance. In Mechanics, however, this pattern was reversed: severity bias was associated with below-average criterion importance, while leniency bias was associated with above-average criterion importance. The disparity between Content and Mechanics was also identified in the analysis of data from individual raters.
Although the study is limited in that participants were not trained raters for English writing assessments, its endeavor to connect rater cognition to scoring behavior will help to expand the scope of research on rater effects.
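
The clustering step described in the abstract, grouping raters by their perceived criterion importance, can be illustrated with a minimal sketch. It is not the thesis's actual procedure: the abstract does not state the linkage method, distance measure, or software used, so average linkage with Euclidean distance is assumed here. The six raters, their 1-5 importance ratings, and the function names are all hypothetical.

```python
from itertools import combinations
from math import dist

CRITERIA = ["Content", "Organization", "Vocabulary", "Language use", "Mechanics"]

# Hypothetical importance ratings (1-5 scale) for six made-up raters.
# These numbers are illustrative only and do not come from the thesis.
ratings = {
    "R1": [5, 4, 3, 3, 1],
    "R2": [5, 5, 3, 3, 1],
    "R3": [2, 3, 3, 4, 5],
    "R4": [2, 3, 3, 5, 5],
    "R5": [3, 3, 3, 3, 3],
    "R6": [3, 3, 4, 3, 3],
}


def average_linkage(a, b):
    """Mean Euclidean distance over all cross-cluster rater pairs (assumed linkage)."""
    total = sum(dist(ratings[x], ratings[y]) for x in a for y in b)
    return total / (len(a) * len(b))


def cluster_raters(n_clusters):
    """Agglomerative clustering: merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in ratings]
    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2), key=lambda pair: average_linkage(*pair))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(sorted(a + b))
    return sorted(clusters)


def mean_profile(cluster):
    """Mean importance rating per criterion for the raters in one cluster."""
    return [
        sum(ratings[r][i] for r in cluster) / len(cluster)
        for i in range(len(CRITERIA))
    ]


rater_types = cluster_raters(3)
for members in rater_types:
    print(members, dict(zip(CRITERIA, mean_profile(members))))
```

With this toy data the three clusters separate raters who weight Content most heavily, raters who weight Mechanics most heavily, and raters who rate all criteria about equally. In the study itself, the analogous grouping on criterion-related bias measures produced the Operational Rater Types.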

Language: eng

URI: https://hdl.handle.net/10371/196866

https://dcollection.snu.ac.kr/common/orgView/000000178511

Files in This Item:

000000178511.pdf 1.29 MB

Appears in Collections:

College of Education (사범대학)
- Dept. of Foreign Language Education (외국어교육과)
  - English Language (영어전공)
    - Theses (Master's Degree_영어전공)
