Identifying Semantically Similar Questions in Social Q&A Communities (Korean title, translated: A Study on Extracting Similar Questions in Social Q&A Communities)

Authors
김범수
Advisor
박진수
Major
College of Business Administration, Department of Business Administration
Issue Date
2018-02
Publisher
Seoul National University Graduate School
Keywords
Q&A; online communities; collective intelligence; wisdom of crowds; language modeling; word2vec; convolutional neural networks; deep learning
Description
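Thesis (Master's) -- Seoul National University Graduate School: College of Business Administration, Department of Business Administration, 2018. 2. 박진수.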
Abstract
Social Q&A (SQA) communities are an impressive instance of knowledge sharing over the Web. A tremendous number of questions are asked and answered every minute in thriving SQA communities such as Yahoo! Answers, the Stack Exchange network, and Quora. However, a large proportion of new questions are redundant, with a semantically similar counterpart already existing in the database. Identifying semantically equivalent questions in SQA communities poses a few thorny challenges: (1) semantically similar questions can be rather dissimilar in syntax and lexicon, (2) obtaining reliable training and test datasets is troublesome, (3) domain- or context-specific language influences the results, and (4) a severe class imbalance problem can seriously hamper the identification process. We suggest a data-driven framework that can overcome these challenges and complement existing models. Our work takes a multidisciplinary approach to building the framework, borrowing concepts and techniques from machine learning, natural language processing (NLP), deep learning, information retrieval, and related fields. Our final model, which uses Word2Vec and convolutional neural networks for language modeling, shows a desirable level of performance, with a test accuracy of 0.975478 and an average precision of 0.983501.
Language
English
URI
https://hdl.handle.net/10371/141267
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
