Automatic Pronunciation Assessment of Korean Spoken by L2 Learners Using Best Feature Set Selection
- Ryu, Hyuksu; Hong, Hyejin; Kim, Sunhee; Chung, Minhwa
- Issue Date
- Asia-Pacific Signal and Information Processing Association (APSIPA)
the Institute of Electrical and Electronics Engineers, Inc.
- 2016 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA), Jeju, 2016, pp. 1-6
- computer aided instruction; feature selection; linguistics; natural languages; principal component analysis; regression analysis; speech recognition; BSS; Chinese language; English language; Japanese language; Korean language learner; L2 Korean Speech Corpus; L2 learners; Mongolian language; PCR; Russian language; automatic pronunciation assessment; best feature set selection; learner speech forced-alignment; learner speech recognition; multiple linear regression; native Korean acoustic model; native language; principal component regression; pronunciation score; salient features; speech segment; spoken Korean language; Acoustics; Computational modeling; Correlation; Feature extraction; Manuals; Speech; Speech recognition
- This paper proposes a method for automatic pronunciation assessment of Korean spoken by L2 learners that selects the best feature set from a collection of the best-known features in the literature. The L2 Korean Speech Corpus is used for assessment modeling; the native languages of the L2 learners are English, Chinese, Japanese, Russian, and Mongolian. In our system, learners' speech is forced-aligned and recognized using a native Korean acoustic model. Based on these results, various features for pronunciation assessment are computed and divided into four categories: RATE, SEGMENT, SILENCE, and GOP. Pronunciation scores produced by combining the feature categories with multiple linear regression serve as the baseline. To improve on the baseline, relevant features are selected using two alternative approaches, Principal Component Regression (PCR) and Best Subset Selection (BSS). The results show that the BSS model outperforms both the baseline and the PCR model, and that features corresponding to speech segment and rate are selected as the relevant ones for automatic pronunciation assessment. The observed tendency of salient features will be useful for further improving automatic pronunciation assessment models for Korean language learners.
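
The Best Subset Selection step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state which selection criterion was used, so adjusted R² is assumed here, and the data are synthetic. The four column names only borrow the category labels from the abstract as hypothetical single features (the paper computes several features per category), and the helper names `solve`, `fit_ols`, `adjusted_r2`, and `best_subset` are invented for illustration.

```python
from itertools import combinations
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def fit_ols(X, y):
    """Multiple linear regression via the normal equations (intercept first)."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(p)]
    return solve(XtX, Xty)


def adjusted_r2(X, y, coef):
    """Adjusted R^2 of a fitted model (assumed criterion, not from the paper)."""
    n, k = len(y), len(coef) - 1
    mean_y = sum(y) / n
    pred = [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]
    ss_res = sum((t - p) ** 2 for t, p in zip(y, pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y)
    return 1.0 - (ss_res / (n - k - 1)) / (ss_tot / (n - 1))


def best_subset(X, y, names):
    """Fit every non-empty feature subset and keep the best-scoring one."""
    best_score, best_idx = float("-inf"), ()
    for k in range(1, len(names) + 1):
        for idx in combinations(range(len(names)), k):
            sub = [[r[i] for i in idx] for r in X]
            score = adjusted_r2(sub, y, fit_ols(sub, y))
            if score > best_score:
                best_score, best_idx = score, idx
    return best_score, [names[i] for i in best_idx]


# Synthetic data for illustration only: the score depends on the first two
# columns; the other two are pure noise. No values here come from the paper.
random.seed(0)
names = ["RATE", "SEGMENT", "SILENCE", "GOP"]
X = [[random.gauss(0, 1) for _ in names] for _ in range(60)]
y = [2.0 * r[0] + 1.5 * r[1] + random.gauss(0, 0.1) for r in X]

score, selected = best_subset(X, y, names)
print(selected, round(score, 3))
```

Exhaustive search fits 2^p - 1 models for p features, so it is practical only for small feature sets. A held-out or cross-validated error would be a reasonable alternative to adjusted R² as the selection criterion.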