Fast and Reliable Inference Algorithms in Crowdsourcing Systems : 크라우드소싱 시스템에서의 빠르고 신뢰성 높은 추론 알고리즘

DC Field | Value | Language
dc.contributor.advisor | 정교민 | -
dc.contributor.author | 이동현 | -
dc.date.accessioned | 2021-11-30T02:30:10Z | -
dc.date.available | 2021-11-30T02:30:10Z | -
dc.date.issued | 2021-02 | -
dc.identifier.other | 000000165499 | -
dc.identifier.uri | https://hdl.handle.net/10371/175340 | -
dc.identifier.uri | https://dcollection.snu.ac.kr/common/orgView/000000165499 | ko_KR
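dc.description | Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2021. 2. 정교민. | -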
dc.description.abstract | As the need for large-scale labeled data grows in various fields, web-based crowdsourcing systems offer a promising way to exploit the wisdom of crowds efficiently, in a short time and on a relatively low budget. Despite their efficiency, crowdsourcing systems have an inherent problem: responses from workers can be unreliable, since workers are low-paid and bear little responsibility. Although simple majority voting is a natural solution, many studies have sought to aggregate noisy responses to obtain more reliable results. In this dissertation, we propose novel iterative message-passing style algorithms that infer the ground truths from noisy answers and can be directly applied to real crowdsourcing systems. While EM-based algorithms have received the most attention in crowdsourcing systems, our proposed algorithms produce answers faster and more reliably through an iterative scheme based on the idea of low-rank matrix approximation. We show that the performance of our proposed iterative algorithms is order-optimal and that they outperform majority voting and EM-based algorithms. Unlike prior work that addresses only simple binary-choice (yes/no) questions, our studies cover more complex task types, including multiple-choice questions, short-answer questions, K-approval voting, and real-valued vector regression. | -
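dc.description.abstract | Today, as many fields require labeled big data, web-based crowdsourcing services have emerged, offering ways to harness the wisdom of crowds efficiently with a relatively small budget and in a short time. Despite the efficiency of these methods, the inherent problem of crowdsourcing systems is that workers' responses cannot be fully trusted, owing to their low compensation and lack of responsibility. Majority voting is used as a natural solution, but much research is under way to obtain more reliable answers.
This doctoral dissertation presents iterative message-passing style algorithms that aggregate the responses received from many people in a crowdsourcing system and infer highly reliable answers. These algorithms are iterative inference methods based on low-rank approximation, and they infer the correct answers faster and more reliably than the EM algorithms that have previously drawn attention. In addition, we show through theoretical proofs and experimental results that the inference accuracy of these algorithms is very close to optimal and exceeds that of majority voting and EM algorithms. This work addresses the inference problem for multiple-choice responses, short-answer responses, multiple-selection responses, and real-valued responses, which make up the majority of response types in real crowdsourcing, and thereby differs substantially from existing studies that address only binary-choice response inference. | -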
dc.description.tableofcontents | 1 Introduction 1
2 Background 9
2.1 Crowdsourcing Systems for Binary-choice Questions 9
2.1.1 Majority Voting 10
2.1.2 Expectation Maximization 11
2.1.3 Message Passing 11
3 Crowdsourcing Systems for Multiple-choice Questions 12
3.1 Related Work 13
3.2 Problem Setup 16
3.3 Inference Algorithm 17
3.3.1 Task Allocation 17
3.3.2 Multiple Iterative Algorithm 18
3.3.3 Task Allocation for General Setting 20
3.4 Applications 23
3.5 Analysis of Algorithms 25
3.5.1 Quality of Workers 25
3.5.2 Bound on the Average Error Probability 27
3.5.3 Proof of the Error Bounds 29
3.5.4 Proof of Sub-Gaussianity 32
3.6 Experimental Results 36
3.6.1 Comparison with Other Algorithms 37
3.6.2 Adaptive Scenario 38
3.6.3 Simulations on a Set of Various D Values 41
3.7 Conclusion 43
4 Crowdsourcing Systems for Multiple-choice Questions with K-Approval Voting 45
4.1 Related Work 47
4.2 Problem Setup 49
4.2.1 Problem Definition 49
4.2.2 Worker Model for Various (D, K) 50
4.3 Inference Algorithm 51
4.4 Analysis of Algorithms 53
4.4.1 Worker Model 55
4.4.2 Quality of Workers 56
4.4.3 Bound on the Average Error Probability 58
4.4.4 Proof of the Error Bounds 59
4.4.5 Proof of Sub-Gaussianity 62
4.4.6 Phase Transition 67
4.5 Experimental Results 68
4.5.1 Performance on the Average Error with q and l 68
4.5.2 Relationship between Reliability and y-message 69
4.5.3 Performance on the Average Error with Various (D, K) Pairs 69
4.6 Conclusion 72
5 Crowdsourcing Systems for Real-valued Vector Regression 73
5.1 Related Work 75
5.2 Problem Setup 77
5.3 Inference Algorithm 78
5.3.1 Task Message 79
5.3.2 Worker Message 80
5.4 Analysis of Algorithms 81
5.4.1 Worker Model 81
5.4.2 Oracle Estimator 84
5.4.3 Bound on the Average Error Probability 86
5.5 Experimental Results 91
5.5.1 Real Crowdsourcing Data 91
5.5.2 Verification of the Error Bounds with Synthetic data 96
5.6 Conclusion 98
6 Conclusions 99 | -
dc.format.extent | viii, 110 | -
dc.language.iso | eng | -
dc.publisher | Seoul National University Graduate School | -
dc.subject | Crowdsourcing | -
dc.subject | Message-passing style algorithm | -
dc.subject | Approximate inference | -
dc.subject | 크라우드소싱 | -
dc.subject | 메세지전달 형태 알고리즘 | -
dc.subject | 근사 추론 | -
dc.subject.ddc | 621.3 | -
dc.title | Fast and Reliable Inference Algorithms in Crowdsourcing Systems | -
dc.title.alternative | 크라우드소싱 시스템에서의 빠르고 신뢰성 높은 추론 알고리즘 | -
dc.type | Thesis | -
dc.type | Dissertation | -
dc.contributor.AlternativeAuthor | Donghyeon Lee | -
dc.contributor.department | College of Engineering, Department of Electrical and Computer Engineering | -
dc.description.degree | Doctor | -
dc.date.awarded | 2021-02 | -
dc.identifier.uci | I804:11032-000000165499 | -
dc.identifier.holdings | 000000000044▲000000000050▲000000165499▲ | -

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
