Identifying Semantically Similar Questions in Social Q&A Communities (소셜 질의응답 커뮤니티에서 유사한 질문 추출에 관한 연구: "A Study on Extracting Similar Questions in Social Q&A Communities")

DC Field | Value | Language
dc.contributor.advisor | 박진수 | -
dc.contributor.author | 김범수 | -
dc.date.accessioned | 2018-05-29T03:02:24Z | -
dc.date.available | 2018-05-29T03:02:24Z | -
dc.date.issued | 2018-02 | -
dc.identifier.other | 000000149620 | -
dc.identifier.uri | https://hdl.handle.net/10371/141267 | -
dc.description | Thesis (Master's) -- Seoul National University Graduate School: College of Business Administration, Department of Business Administration, 2018. 2. 박진수. | -
dc.description.abstract | Social Q&A (SQA) communities are an impressive instance of knowledge sharing over the Web. A tremendous number of questions are asked and answered every minute in thriving SQA communities such as Yahoo! Answers, the Stack Exchange network, and Quora. However, a large proportion of new questions are redundant: a semantically similar counterpart already exists in the database. Identifying semantically equivalent questions in SQA communities poses a few thorny challenges: (1) semantically similar questions can differ considerably in syntax and lexicon, (2) reliable training and test datasets are difficult to obtain, (3) domain- or context-specific language influences the identification, and (4) severe class imbalance can seriously hamper the identification process. We propose a data-driven framework that can overcome these challenges and complement existing models. Our work takes a multidisciplinary approach to building the framework, borrowing concepts and techniques from machine learning, natural language processing (NLP), deep learning, information retrieval, and related fields. Our final model, which uses Word2Vec and convolutional neural networks for language modeling, achieves a test accuracy of 0.975478 and an average precision of 0.983501. | -
dc.description.tableofcontents |
    1. Introduction (p. 1)
    2. Related Works (p. 6)
    3. Methodology (p. 17)
        3.1 Data Collection & Preprocessing (p. 18)
        3.2 Language Modeling (p. 22)
        3.3 Identification (Classification) (p. 25)
        3.4 Model Selection & Evaluation (p. 29)
    4. Results (p. 32)
        4.1 Initial Attempt (p. 32)
        4.2 Revised Approach (p. 35)
    5. Conclusion (p. 36)
    References (p. 40)
    Appendix 1. Visualization of Model (p. 47)
    Appendix 2. Grid Search Results (p. 48)
    Appendix 3. Random Search Results (p. 51)
  | -
dc.format | application/pdf | -
dc.format.extent | 1114490 bytes | -
dc.format.medium | application/pdf | -
dc.language.iso | en | -
dc.publisher | Seoul National University Graduate School | -
dc.subject | Q&A | -
dc.subject | online communities | -
dc.subject | collective intelligence | -
dc.subject | wisdom of crowds | -
dc.subject | language modeling | -
dc.subject | word2vec | -
dc.subject | convolutional neural networks | -
dc.subject | deep learning | -
dc.subject.ddc | 658 | -
dc.title | Identifying Semantically Similar Questions in Social Q&A Communities | -
dc.title.alternative | 소셜 질의응답 커뮤니티에서 유사한 질문 추출에 관한 연구 | -
dc.type | Thesis | -
dc.description.degree | Master | -
dc.contributor.affiliation | College of Business Administration, Department of Business Administration | -
dc.date.awarded | 2018-02 | -

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
