S-Space Language Education Institute (언어교육원) Language Research (어학연구) Volume 38 Number 1/4 (2002)
Construction of an Electronic Database for the Automatic Processing of Proper Nouns (고유명사 자동 처리를 위한 전자 데이터베이스의 구축)
- Issue Date: 2002
- Language Education Institute, Seoul National University (서울대학교 언어교육원)
- Language Research (어학연구), Vol. 38 No. 1, pp. 407-441
- proper noun; electronic lexicon; encyclopedic noun; current noun; suffix form; bottom-up classification; local grammar
- This study presents DECOR, a Korean electronic lexicon of proper nouns designed to handle the majority of the unknown words that appear in automatic text processing. DECOR consists of two modules: the Encyclopedic Nouns module, DECOR-E, and the Current Nouns module, DECOR-C. This paper focuses on the organization of DECOR-E. The DECOR-E lexicon contains about 33,000 proper nouns (i.e., encyclopedic nouns) and 2,200 suffix patterns related to these nouns. Because the suffix patterns explicitly represent specific semantic features of proper nouns, they are used to classify the nouns formally. This yields 4 classes of proper nouns, each divided into 3 or 4 sub-classes. The DECOR-C lexicon of Current Nouns is being constructed on the basis of DECOR-E. Finally, we discuss a powerful auxiliary tool for an electronic lexicon of proper nouns, the 'Local Grammar'. Local grammars, also called Finite-State Local Automata, are represented as directed acyclic graphs; they allow the automatic analyzer to recognize transformed sequences that carry the same information and consist of the same lexical words, but in a different syntagmatic order.
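The suffix-based classification described in the abstract can be illustrated with a minimal sketch. This is not the DECOR-E implementation: the suffix table, the class labels, and the function name `classify_by_suffix` are hypothetical, chosen only to show how a suffix pattern could assign a semantic class to a proper noun by longest-suffix matching.

```python
# Hypothetical suffix table: the entries and class labels below are
# illustrative assumptions, not taken from DECOR-E.
SUFFIX_CLASSES = {
    "대학교": "organization",  # university
    "산": "place",             # mountain
    "강": "place",             # river
    "시": "place",             # city
}


def classify_by_suffix(noun, table=SUFFIX_CLASSES):
    """Return (suffix, class) for the longest matching suffix, or None."""
    # Try longer suffixes first so the most specific pattern wins.
    for suffix in sorted(table, key=len, reverse=True):
        # Require a non-empty stem so a bare suffix is not classified.
        if noun.endswith(suffix) and len(noun) > len(suffix):
            return suffix, table[suffix]
    return None
```

Under these assumptions, `classify_by_suffix("서울대학교")` returns the pair `("대학교", "organization")`, while a noun with no listed suffix returns `None` and would need another treatment, such as a direct lexicon entry.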