Estimating the Helpfulness of Product Reviews based on Review Information Types : 리뷰 정보 유형에 기반한 상품평 유용성 평가
- College of Humanities, Department of Linguistics
- Seoul National University Graduate School
- review helpfulness estimation ; review information types ; latent dirichlet allocation ; topic-based approach ; product review evaluation
- Thesis (Ph.D.) -- Seoul National University Graduate School: Department of Linguistics, August 2016. 신효필.
- The sheer number of reviews for any given product makes it impractical for potential customers to locate the reviews that will be helpful to them. This creates a need to estimate the helpfulness of product reviews automatically, so that customers can find the most helpful ones as quickly and easily as possible. Researchers have explored multiple ways of evaluating review helpfulness, but have mainly focused on how reviews deliver information, e.g., length, sentiment, and readability. We instead assume that what information a review delivers to customers matters more than how that information is delivered.
Therefore, this study investigates a way of extracting what information reviews deliver to estimate the helpfulness of those reviews.
To extract the information that reviews contain, we categorized each sentence by review information type (RIT). With respect to the information target, information is divided into background information about the reviewer's previous experience or expertise, core information about the product, peripheral information about matters other than the product itself, such as shipping or packaging, and non-relevant information. Overall information covers the final purchasing decision, summaries, and recommendations.
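The categories above can be pictured as a small label set attached to each sentence. The following is only a minimal sketch: the names `ReviewInfoType`, `AnnotatedSentence`, and `type_profile`, as well as the example sentences, are hypothetical and are not taken from the thesis.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class ReviewInfoType(Enum):
    """Hypothetical labels mirroring the information types named in the abstract."""
    BACKGROUND = "background"      # reviewer's previous experience or expertise
    CORE = "core"                  # information about the product itself
    PERIPHERAL = "peripheral"      # shipping, packaging, and similar matters
    OVERALL = "overall"            # purchasing decision, summary, recommendation
    NON_RELEVANT = "non-relevant"  # anything else


@dataclass
class AnnotatedSentence:
    text: str
    info_type: ReviewInfoType


def type_profile(sentences):
    """Return the share of sentences of each information type in one review."""
    counts = Counter(s.info_type for s in sentences)
    total = len(sentences)
    return {t: (counts[t] / total if total else 0.0) for t in ReviewInfoType}


# Invented example review; the annotations are illustrative only.
review = [
    AnnotatedSentence("I have owned three tents before.", ReviewInfoType.BACKGROUND),
    AnnotatedSentence("The poles are light but sturdy.", ReviewInfoType.CORE),
    AnnotatedSentence("The rain fly kept us dry all night.", ReviewInfoType.CORE),
    AnnotatedSentence("I would recommend it.", ReviewInfoType.OVERALL),
]
profile = type_profile(review)
```

Here `profile` maps every label to its proportion within the review, which is one simple way such annotations could be summarized per review.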
Once the information type of each sentence is categorized, every sentence is converted into a vector over topic dimensions using Latent Dirichlet Allocation (LDA). For each information type, the topic-based vectors are clustered to find groups of sentences that hold similar information. These clusters are then used to identify what information each sentence in the product review test data delivers.
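The clustering and lookup steps can be sketched roughly as follows. The abstract does not name the clustering algorithm, so this example assumes a simple k-means-style procedure with cosine similarity, and it assumes the per-sentence topic vectors have already been produced by an LDA model (not shown). All function names and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest(vector, centroids):
    """Index of the centroid most similar to the given vector."""
    return max(range(len(centroids)), key=lambda i: cosine(vector, centroids[i]))


def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cluster_topic_vectors(vectors, k, iterations=20):
    """k-means-style clustering; the first k vectors seed the centroids."""
    centroids = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for v in vectors:
            groups[nearest(v, centroids)].append(v)
        # Keep the old centroid if a cluster received no vectors.
        updated = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


# Toy topic vectors (3 topics) for training sentences of one information type.
train_vectors = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.7, 0.2, 0.1],
    [0.2, 0.7, 0.1],
]
centroids = cluster_topic_vectors(train_vectors, k=2)

# A test sentence is mapped to the cluster whose centroid it most resembles.
test_cluster = nearest([0.9, 0.05, 0.05], centroids)
```

Running one clustering per information type, as the text describes, would amount to calling `cluster_topic_vectors` separately on the vectors of each type.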
Product reviews for an e-book reader, an outdoor tent, and jeans were collected from Amazon.com. For each product domain, 200 reviews were chosen for training and testing in the experiments. The helpfulness score of each review and the review information type of each sentence were manually annotated for this study.
To begin with, we show through several classification experiments to what extent the information type of each sentence can be predicted correctly. The review information type of each sentence is predicted from features such as bag-of-words, the position of the sentence in the review, and the form and part-of-speech tag of the main subject, verb, and auxiliaries.
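As an illustration of two of the feature families mentioned above, the sketch below builds bag-of-words counts and sentence-position features for one sentence of a review. The function name, the feature names, and the simple regular-expression tokenizer are assumptions made for this example; the subject, verb, and auxiliary features are omitted because they would require a part-of-speech tagger or parser.

```python
import re
from collections import Counter


def sentence_features(sentences, index):
    """Bag-of-words and position features for sentences[index] of one review."""
    tokens = re.findall(r"[a-z']+", sentences[index].lower())
    features = dict(Counter(f"bow={token}" for token in tokens))

    # Relative position: 0.0 for the first sentence, 1.0 for the last.
    n = len(sentences)
    features["position"] = index / (n - 1) if n > 1 else 0.0
    features["is_first"] = int(index == 0)
    features["is_last"] = int(index == n - 1)
    return features


# Invented example review, split into sentences.
review = [
    "I have owned three tents before.",
    "This tent is light and the tent poles are sturdy.",
    "I would recommend it.",
]
middle = sentence_features(review, 1)
```

A feature dictionary of this shape could then be passed to whichever classifier is used to predict the information type.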
A preliminary experiment was conducted to assess the feasibility of using background information to predict the helpfulness of product reviews. The result indicates that our approach, using only background information, performs as effectively as the features from previous studies.
The final experiments compare the effect of extracting what information is delivered with that of extracting how information is delivered on estimating the helpfulness of product reviews. Across these experiments, we show that our approach of extracting what information is delivered estimates the helpfulness of reviews more accurately than features related to how information is delivered.