Detailed Information

Predicting L2 Writing Proficiency with Computational Indices Based on N-grams

DC Field | Value | Language
dc.contributor.author | Oh, Byung-Doh | -
dc.date.accessioned | 2018-05-09T05:12:01Z | -
dc.date.available | 2018-05-09T05:12:01Z | -
dc.date.issued | 2017-12-31 | -
dc.identifier.citation | 외국어교육연구 (Foreign Language Education Research), Vol.21, pp. 1-20 | -
dc.identifier.issn | 1229-5892 | -
dc.identifier.uri | https://hdl.handle.net/10371/139753 | -
dc.description.abstract | Linguistic features that are indicative of higher writing proficiency levels can inform many aspects of language assessment such as scoring rubrics, test items, and automated essay scoring (AES). The recent advancement of computer algorithms that automatically calculate indices based on various linguistic features has made it possible to examine the relationship between linguistic features and writing proficiency on a larger scale. While the ability to use appropriate n-grams – recurring sequences of contiguous words – has been identified as a characteristic differentiating between proficiency levels in the literature, few studies have examined this relationship using computational indices. To this end, this study utilized the Tool for the Automatic Analysis of Lexical Sophistication (TAALES; Kyle & Crossley, 2015) to calculate eight indices based on n-grams from a stratified corpus consisting of 360 argumentative essays written by Korean college-level learners. First, the indices from the training set of 240 essays were used to design a multinomial logistic regression model in order to identify indices that are significant predictors of writing proficiency levels. Subsequently, the regression model was applied to a test set of 120 essays to examine whether the model could be used to predict the proficiency levels of unseen essays. The results revealed that the mean bigram T, mean bigram Delta P, mean bigram-to-unigram Delta P, and proportion of 30,000 most frequent trigrams indices were significant predictors of proficiency levels. Furthermore, the regression model based on eight indices correctly classified 52.5% of essays in the test set, demonstrating above-chance level accuracy. | -
dc.language.iso | en | -
dc.publisher | 서울대학교 외국어교육연구소 (Foreign Language Education Research Institute, Seoul National University) | -
dc.subject | L2 writing proficiency | -
dc.subject | n-grams | -
dc.subject | phraseology | -
dc.subject | computational linguistics | -
dc.subject | language assessment | -
dc.title | Predicting L2 Writing Proficiency with Computational Indices Based on N-grams | -
dc.type | SNU Journal | -
dc.contributor.AlternativeAuthor | 오병도 | -
dc.citation.journaltitle | 외국어교육연구 (Foreign Language Education Research) | -
dc.citation.endpage | 20 | -
dc.citation.pages | 1-20 | -
dc.citation.startpage | 1 | -
dc.citation.volume | 21 | -
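
Illustrative note: the abstract names bigram T and bigram Delta P among the significant predictors. The sketch below shows how such association scores are commonly defined, using the textbook t-score, (observed - expected) / sqrt(observed), and Delta P(w2 | w1) = P(w2 | w1) - P(w2 | not w1). It is not the TAALES implementation: the function names are hypothetical, and the counts are taken from the toy text itself purely to keep the example self-contained, whereas a tool of this kind would normally draw frequencies from a large reference corpus.

```python
from collections import Counter
from math import sqrt


def bigram_association(tokens):
    """Return {bigram: (t_score, delta_p)} for contiguous word pairs.

    Hypothetical helper: counts come from `tokens` itself (an assumption
    made for illustration), not from a reference corpus.
    """
    bigrams = list(zip(tokens, tokens[1:]))
    n = len(bigrams)
    first = Counter(w1 for w1, _ in bigrams)   # w1 in first position
    second = Counter(w2 for _, w2 in bigrams)  # w2 in second position
    pair = Counter(bigrams)

    scores = {}
    for (w1, w2), a in pair.items():
        b = first[w1] - a        # w1 followed by a different word
        c = second[w2] - a       # w2 preceded by a different word
        d = n - a - b - c        # bigrams containing neither pattern
        expected = first[w1] * second[w2] / n
        t_score = (a - expected) / sqrt(a)
        p_given_w1 = a / (a + b)
        p_given_other = c / (c + d) if (c + d) else 0.0
        scores[(w1, w2)] = (t_score, p_given_w1 - p_given_other)
    return scores


def mean_delta_p(tokens):
    """Mean Delta P over every bigram occurrence in the text."""
    scores = bigram_association(tokens)
    bigrams = list(zip(tokens, tokens[1:]))
    return sum(scores[bg][1] for bg in bigrams) / len(bigrams)


if __name__ == "__main__":
    toy = "in my opinion in my view".split()
    for bigram, (t, dp) in bigram_association(toy).items():
        print(bigram, round(t, 3), round(dp, 3))
    print("mean Delta P:", round(mean_delta_p(toy), 3))
```

In the toy text, "in" is always followed by "my", so that pair gets a Delta P of 1.0, while "my" is followed by "opinion" only half the time, giving 0.5. Averaging such scores over a whole essay yields one text-level index of the sort that could then feed a classifier.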

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
