Sentiment Analysis of Online Reviews based on Genre-specific Discourse Patterns : 장르 특정적 담화 유형 기반의 온라인 리뷰의 감정분석


Otmakhova Yulia

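College of Humanities, Department of Linguistics
Graduate School, Seoul National University
sentiment analysis, opinion mining, online reviews, product reviews, discourse analysis
Thesis (Master's) -- Graduate School, Seoul National University: Department of Linguistics, Linguistics major, August 2015. 신효필.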
Although sentiment analysis has in recent years evolved from simple lexicon-based and statistical models to methods that use discourse information, the major problem with current approaches is that they apply the same set of features to sentiment classification of texts of all genres and types (tweets, editorials, discussion board posts, online reviews, etc.). Moreover, the features used by previous researchers reflect only one aspect of discourse, namely coherence, and are limited to explicit ways of ensuring coherence, such as conjunctions. More specifically, these features are implicit coherence, realized through the adjacency of two sentences; continuity, which shows that two sentences have the same sentiment and is commonly signaled by conjunctions such as "and" or "moreover"; and contrast, which is signaled by conjunctions such as "but" and marks a shift in the opinion's polarity.
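The three generic features described above can be illustrated with a minimal Python sketch. It is not the method used in the thesis: the function name `generic_relations` is hypothetical, the conjunction lists contain only the examples named in the abstract, and the sketch assumes that a conjunction counts only when it opens a sentence.

```python
import re

# Conjunctions named in the abstract; a real system would need fuller lists.
CONTINUITY = {"and", "moreover"}
CONTRAST = {"but"}


def generic_relations(sentences):
    """Label the link between each sentence and the one before it.

    Simplifying assumption: only the first word of a sentence is checked.
    """
    labels = []
    for sentence in sentences[1:]:
        match = re.match(r"\W*([a-z]+)", sentence.lower())
        first_word = match.group(1) if match else ""
        if first_word in CONTRAST:
            labels.append("contrast")
        elif first_word in CONTINUITY:
            labels.append("continuity")
        else:
            # No explicit marker: the sentences are linked by adjacency alone.
            labels.append("implicit_coherence")
    return labels


review = [
    "The screen is sharp.",
    "Moreover, the battery lasts long.",
    "But the case feels cheap.",
    "I returned it.",
]
print(generic_relations(review))
```

Each label describes one pair of adjacent sentences, so a review of n sentences yields n - 1 labels.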
In this study we propose a new set of features that reflects the specific traits of one particular genre, online reviews: implicit contrast, realized through limiting expressions such as "the only drawback"; background patterns, which are expressions that help to establish a review author's identity; and involvement features, which are used to interact with the reader.
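As a rough illustration of how such genre-specific features might be counted in a review, consider the Python sketch below. Only the expression "the only drawback" comes from the abstract. The background and involvement patterns, the `PATTERNS` table and the function `genre_features` are hypothetical placeholders, and simple pattern matching is an assumption of this sketch rather than the extraction procedure of the thesis.

```python
import re

PATTERNS = {
    # Limiting expression cited in the abstract.
    "implicit_contrast": [r"\bthe only drawback\b"],
    # Hypothetical cues that establish the author's identity.
    "background": [r"\bas an? \w+", r"\bi have (?:owned|used)\b"],
    # Hypothetical cues that address the reader directly.
    "involvement": [r"\byou\b", r"\?"],
}


def genre_features(review):
    """Count pattern matches per feature type in one review text."""
    text = review.lower()
    return {
        name: sum(len(re.findall(pattern, text)) for pattern in patterns)
        for name, patterns in PATTERNS.items()
    }


review = (
    "As a photographer, I have used many cameras. "
    "The only drawback is the weight. Would you buy it?"
)
print(genre_features(review))
```

The resulting counts could be combined with non-discourse and generic discourse features to form one feature set per review.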
To show the effectiveness of these features, we annotated a corpus of 120 product reviews and represented each review as a set of non-discourse features, generic discourse features and genre-specific discourse features extracted from it (together with the target label from the annotation). These feature sets were used in two series of experiments: fine-grained and coarse-grained. At the sentence level we conducted the experiments with and without lexical features, while at the document level we performed 5-, 3- and 2-class classification. Our experiments showed that genre-specific features generally perform better than generic ones, yielding greater improvements in precision and recall. Whereas generic features led to only minor increases or even degraded performance (as in the case of implicit coherence), genre-specific features (especially background) were more stable and allowed us to achieve better recall and precision across all experiments. These tendencies were especially pronounced in the fine-grained classification with lexical features, where adding generic discourse features to the lexical ones degraded the results. Moreover, the performance of genre-specific features is not only statistically reliable but also reflects the theoretical properties of online review discourse outlined in our study.
