Construct Validity in Human Scoring and Criterion: What Criterion would (not) measure


Keywords: AES ; human-scoring ; construct dimension ; validity ; independent writing

Abstract: This pilot study examines the reliability of automated essay scoring (AES) and investigates the validity of the constructs that Criterion does and does not measure. Criterion's scores on iBT TOEFL independent writing tasks were compared with human raters' evaluations. In particular, the study explored which essay features were most closely related to each of the six analytic dimensions of e-rater (Criterion). Five types of prompts were used to administer a writing test to fifty college students in Seoul. The results showed moderate agreement between the human raters and Criterion. In human rating, three essay features (development, organization, and grammar) were crucial predictors of the holistic score. In AES, however, grammar was a powerful predictor of the holistic score, which indicates that Criterion did not evaluate development and organization appropriately. This result suggests that e-rater's feature dimensions need to be refined in the development and organization construct dimensions. The findings have implications for teaching process writing and for using AES.
