S-Space > Language Education Institute (언어교육원) > Language Research (어학연구) > Volume 41 Number 1/4 (2005)
언어 자료의 통계 분석과 관련된 몇 가지 고려사항들
Some considerations on the analysis of linguistic data based on statistics
- Issue Date: 2005
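- Publisher: 서울대학교 언어교육원 (Language Education Institute, Seoul National University)
- Citation: 어학연구 (Language Research), Vol. 41, No. 3, pp. 655-682
- Keywords: statistical analysis of text; corpus; collocation; statistical methods; quantitative analysis; corpus linguistics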
- Much work has been done on the statistical analysis of text. In many cases, statistical methods have been applied without considering the distributional characteristics of texts. The asymptotic normality assumptions underlying some statistical methods, for example, have proven inappropriate for corpus-based work, especially when rare events make up a large fraction of the data. This paper deals with the statistical basis for the quantitative analysis of text. It suggests that statistical methods should be chosen according to the characteristics of the text, and that linguistic interpretation of statistical results is still required to compensate for the limitations of the statistics. The paper also describes some widely used statistical methods, such as the t-test, chi-square, likelihood, and mutual information, and points out the characteristics of each method.
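
The four measures named in the abstract can be illustrated on a single word pair. The following is a minimal sketch, not the paper's own procedure: the function name `association_scores`, its parameters, and the counts in the usage note are hypothetical, and "likelihood" is read here as the log-likelihood ratio statistic, which is an assumption. It computes each score from bigram counts via a 2x2 contingency table.

```python
import math


def association_scores(c12, c1, c2, n):
    """Collocation scores for a word pair (w1, w2). Hypothetical helper.

    c12: count of the bigram "w1 w2"
    c1:  count of bigrams whose first word is w1
    c2:  count of bigrams whose second word is w2
    n:   total number of bigrams in the corpus
    """
    if not (0 < c12 <= min(c1, c2) and c1 < n and c2 < n):
        raise ValueError("counts must satisfy 0 < c12 <= min(c1, c2) and c1, c2 < n")
    if n - c1 - c2 + c12 < 0:
        raise ValueError("counts are inconsistent with the corpus size n")

    # Observed 2x2 table: (w1, w2), (w1, not w2), (not w1, w2), (not w1, not w2)
    observed = [c12, c1 - c12, c2 - c12, n - c1 - c2 + c12]
    # Expected cell counts if w1 and w2 occurred independently
    expected = [
        c1 * c2 / n,
        c1 * (n - c2) / n,
        (n - c1) * c2 / n,
        (n - c1) * (n - c2) / n,
    ]

    p_obs = c12 / n              # observed bigram probability
    p_exp = (c1 / n) * (c2 / n)  # probability expected under independence

    # t-score: Bernoulli variance estimate, relies on a normal approximation
    t_score = (p_obs - p_exp) / math.sqrt(p_obs * (1 - p_obs) / n)

    # Pearson chi-square over the four cells
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Log-likelihood ratio G^2; empty cells contribute nothing to the sum
    log_likelihood = 2 * sum(
        o * math.log(o / e) for o, e in zip(observed, expected) if o > 0
    )

    # Pointwise mutual information in bits
    pmi = math.log2(p_obs / p_exp)

    return {
        "t_score": t_score,
        "chi_square": chi_square,
        "log_likelihood": log_likelihood,
        "pmi": pmi,
    }
```

With made-up counts such as `association_scores(10, 20, 20, 100)`, all four scores are positive, whereas counts that match independence exactly, such as `association_scores(2, 100, 200, 10000)`, give scores of essentially zero. The measures can rank word pairs differently when counts are small, which is where the abstract's caution about rare events and the need for linguistic interpretation applies.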