PLF-Join: An Efficient MapReduce Algorithm for Vector Similarity Join
- Authors
- Advisor
- 이상구
- Major
- College of Engineering, Department of Electrical and Computer Engineering
- Issue Date
- 2015-02
- Publisher
- Graduate School, Seoul National University
- Keywords
- Vector similarity join
- Description
- Thesis (Master's) -- Graduate School, Seoul National University: Department of Electrical and Computer Engineering, February 2015. 이상구.
- Abstract
- Vector similarity join is the problem of finding, from a set of vectors, all pairs whose similarity exceeds a given threshold. It is used in many applications, such as near-duplicate detection in web pages, recommendation, and social data mining. However, a naive computation requires O(n^2) time, where n is the number of vectors, and this cost makes vector similarity join hard to apply to many real-world problems.
Hence, many Hadoop MapReduce algorithms have been proposed to compute vector similarity join quickly. The state-of-the-art algorithm uses prefix filtering and length filtering to reduce the running time of the join. To reduce this time further, we propose a variation of an algorithm that reduces the overhead of network I/O. Along with the MapReduce algorithm, we propose an efficient pre-processing technique that facilitates the vector similarity join computation.
- Language
- English
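To make the problem statement in the abstract concrete, the following is a minimal single-machine sketch of the brute-force baseline, not the PLF-Join algorithm itself. It assumes cosine similarity and sparse vectors stored as dictionaries; the record does not state which similarity measure the thesis uses, and the names `cosine` and `naive_similarity_join` are hypothetical.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {dimension: weight} dicts.

    Assumption: cosine is used only for illustration; the thesis may use
    a different similarity measure.
    """
    if len(u) > len(v):
        u, v = v, u  # iterate over the shorter vector
    dot = sum(w * v.get(d, 0.0) for d, w in u.items())
    norm_u = sqrt(sum(w * w for w in u.values()))
    norm_v = sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def naive_similarity_join(vectors, threshold):
    """Return (i, j, similarity) for every pair whose similarity exceeds threshold.

    All n * (n - 1) / 2 pairs are compared, which is the O(n^2) cost
    the abstract refers to.
    """
    results = []
    for (i, u), (j, v) in combinations(enumerate(vectors), 2):
        sim = cosine(u, v)
        if sim > threshold:
            results.append((i, j, sim))
    return results


# Hypothetical toy data: vectors 0 and 1 are near-duplicates, vector 2 is unrelated.
vectors = [
    {"a": 1.0, "b": 1.0},
    {"a": 1.0, "b": 1.0, "c": 0.1},
    {"x": 1.0},
]
pairs = naive_similarity_join(vectors, 0.9)
print(pairs)
```

Filtering techniques such as the prefix and length filtering mentioned in the abstract aim to skip most of these pairwise comparisons before the similarity is computed.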
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.