Construct Validity in Human Scoring and Criterion: What Criterion would(not) measure

Authors

Koo Jungyeon

Issue Date
2020-09-01
Publisher
Department of English Language and Literature, College of Humanities, Seoul National University (서울대학교 인문대학 영어영문학과)
Citation
영학논집 (English Studies), Vol. 40, pp. 133-166
Keywords
AES; human-scoring; construct dimension; validity; independent writing
Abstract
This pilot study examines the reliability of automated essay scoring (AES) and investigates the validity of the constructs that Criterion would or would not measure. Criterion assessed iBT TOEFL independent writing tasks, and its scores were compared with human raters' evaluations. In particular, the study explored which essay features were most closely related to each of the six analytic dimensions for e-rater (Criterion). Five types of prompts were used to administer a writing test to fifty college students in Seoul. The results showed that agreement between the human raters and Criterion was moderate. In addition, three essay features (development, organization, and grammar) were crucial predictors of the holistic score in human rating. In AES, however, grammar was a powerful predictor of the overall score, which indicates that development and organization were not evaluated appropriately in Criterion. This result suggests that the feature dimensions in e-rater need to be refined or revised in the development and organization construct dimensions. The findings have implications for teaching students process writing and for using AES.
Language
English
URI
https://hdl.handle.net/10371/168960

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.