Publications

Detailed Information

Created era estimation of old Korean documents via deep neural network

Cited 1 time in Web of Science Cited 0 time in Scopus
Authors

Yoo, Inseon; Kim, Hyuntai

Issue Date
2022-09
Publisher
Springer Science + Business Media
Citation
Heritage Science, Vol.10 No.1, p. 144
Abstract
In general, the created era of a literary work is significant information for understanding the background and the literary interpretation of the work. However, in the case of literary works of old Korea, especially works created in Hangul, there are few works of which the era of creation are known. In this paper, the created era of old Korean documents was estimated based on artificial intelligence. Hangul, a Korean letter system where one syllable is one character, has more than 10,000 combinations of characters, so it is available to predict changes in the structure or grammar of Hangul by analyzing the frequency of characters. Accordingly, a deep neural network model was constructed based on the term frequency of each character in Hangul. Model training was performed based on 496 documents with known publication years, and the mean-absolute-error of the test set for the entire prediction range from 1447 to 1934 was 13.77 years for test sets and 15.8 years for validation sets, which is less than an error ratio of 3.25% compared to the total year range. In addition, the predicted results of works from which only the approximate creation time was inferred were also within the range, and the predicted creation years for other divisions of the identical novel were similar. These results show that the deep neural network model based on character term frequency predicted the creation era of old Korean documents properly. This study is expected to support the literary history of Korea within the period from 15C to 19C by predicting the period of creation or enjoyment of the work. In addition, the method and algorithm using syllable term frequency are believed to have the potential to apply in other language documents.
ISSN
2050-7445
URI
https://hdl.handle.net/10371/185673
DOI
https://doi.org/10.1186/s40494-022-00772-9
Files in This Item:
There are no files associated with this item.
Appears in Collections:

Altmetrics

Item View & Download Count

  • mendeley

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.

Share