A Corpus-Based Study on Segment Variations in English Speech Produced by Korean Learners
한국인 학습자의 영어 분절음 변이에 관한 코퍼스 기반 연구
- College of Humanities, Department of Linguistics
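- Seoul National University Graduate School
- Dissertation (Ph.D.) -- Seoul National University Graduate School: Department of Linguistics, February 2015. 정민화.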
- This study investigates segment variations occurring in a large speech corpus produced by Korean learners and examines the effects of linguistic and extra-linguistic factors on the segment variations produced by these learners.
A variation matrix of the learners' speech production was generated from manual transcriptions of their speech corpus. A corresponding variation matrix for native speech was generated from the TIMIT corpus, which covers a range of dialectal variation. The most noticeable variations in the learners' speech were then determined by comparing the variation matrices of the learners and the native speakers. These comprised five vocalic variations (insertion of /ɯ/, substitution of /oʊ/ for /ɔ/, substitution of /oʊ/ for /ɑ/, substitution of /ɑ/ for /ʌ/, and substitution of /æ/ for /ʌ/) and four consonantal variations (substitution of /s/ for /z/, substitution of /d/ for /ð/, substitution of /b/ for /v/, and substitution of /p/ for /f/). The results demonstrated that the learners exhibited segment variations that differed from those of native speakers. The learners' vocalic variations were strongly influenced by orthography, whereas their consonantal variations showed effects of the native language.
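The matrix comparison described above can be illustrated with a minimal sketch. The abstract does not specify how the matrices were compared, so this example assumes a simple difference in row-normalized rates. The function names, the `EPS` marker for an empty slot (used to represent insertions), and the toy counts are all hypothetical and are not data from the study.

```python
from collections import Counter, defaultdict

EPS = "-"  # hypothetical marker for "no segment" (an insertion slot)


def variation_matrix(aligned_pairs):
    """Return {canonical: {realized: relative frequency}} from aligned phone pairs."""
    counts = defaultdict(Counter)
    for canonical, realized in aligned_pairs:
        counts[canonical][realized] += 1
    matrix = {}
    for canonical, row in counts.items():
        total = sum(row.values())
        matrix[canonical] = {realized: n / total for realized, n in row.items()}
    return matrix


def noticeable_variations(learner, native, top_n=3):
    """Rank non-identity cells by how far the learner rate exceeds the native rate.

    Assumption: a plain rate difference stands in for the comparison method,
    which the abstract does not describe.
    """
    scored = []
    for canonical, row in learner.items():
        for realized, rate in row.items():
            if realized == canonical:
                continue  # a correct realization is not a variation
            native_rate = native.get(canonical, {}).get(realized, 0.0)
            scored.append((rate - native_rate, canonical, realized))
    scored.sort(reverse=True)
    return [(c, r, round(diff, 3)) for diff, c, r in scored[:top_n]]


# Toy (canonical, realized) pairs, invented for illustration only.
learner_pairs = (
    [("z", "s")] * 6 + [("z", "z")] * 4
    + [("ð", "d")] * 4 + [("ð", "ð")] * 6
    + [(EPS, "ɯ")] * 2 + [(EPS, EPS)] * 8
)
native_pairs = (
    [("z", "s")] * 2 + [("z", "z")] * 8
    + [("ð", "d")] * 1 + [("ð", "ð")] * 9
    + [(EPS, EPS)] * 10
)

learner = variation_matrix(learner_pairs)
native = variation_matrix(native_pairs)
top = noticeable_variations(learner, native)
for canonical, realized, diff in top:
    print(f"/{canonical}/ -> /{realized}/: learner rate exceeds native rate by {diff}")
```

Normalizing each row before comparison keeps frequent and rare canonical segments on the same scale, so a variation is ranked by how characteristic it is of the learners rather than by raw token count.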
To investigate the linguistic and extra-linguistic factors constraining the learners' segment variations, the effects of these factors on the most noticeable variations were analyzed, in the learners' speech only, using generalized linear mixed models. The results indicated that the most noticeable variations were affected by both linguistic and extra-linguistic factors. Among the linguistic factors, segmental context had a strong effect on the most noticeable variations, and an effect of orthography was found in the most noticeable vocalic variations. Among the extra-linguistic factors, both learner gender and speech rate affected the most noticeable variations.
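The abstract does not give the model specification. As a hedged illustration only, a generalized linear mixed model for a binary outcome (whether a given variation occurs) is often written with a logit link and a per-speaker random intercept; both choices are assumptions here, as are the symbols:

```latex
\[
\operatorname{logit} P(y_{ij} = 1)
  = \beta_0 + \sum_{k} \beta_k x_{ijk} + u_j,
\qquad
u_j \sim \mathcal{N}(0, \sigma_u^2)
\]
```

Here $y_{ij}$ indicates whether the variation occurs in token $i$ produced by speaker $j$, the $x_{ijk}$ are fixed-effect predictors such as segmental context, orthography, learner gender, and speech rate, and $u_j$ is the assumed random intercept that absorbs speaker-specific tendencies.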
This dissertation describes the first quantitative study of English segment variations produced by Korean learners using a large-scale speech corpus. In addition, the study contributes to the understanding of the general effects of linguistic and extra-linguistic factors on learners' variations. The results can be used in both context-independent and context-dependent modeling of segment variations to improve the performance of non-native speech recognition systems.