PLF-Join: An Efficient MapReduce Algorithm for Vector Similarity Join

Authors

김현준

Advisor
이상구
Major
College of Engineering, Department of Electrical and Computer Engineering
Issue Date
2015-02
Publisher
Seoul National University Graduate School
Keywords
Vector similarity join
Description
Thesis (Master's) -- Seoul National University Graduate School: Department of Electrical and Computer Engineering, February 2015. 이상구.
Abstract
Vector similarity join is the problem of finding, in a set of vectors, all pairs whose similarity exceeds a given threshold. It is used in many applications, such as near-duplicate detection in web pages, recommendation, and social data mining. However, it requires O(n^2) time, where n is the number of vectors. This impractical time complexity makes it hard to apply vector similarity join to many real-world problems.
Hence, many Hadoop MapReduce algorithms have been proposed to compute vector similarity join quickly. The state-of-the-art algorithm uses prefix filtering and length filtering to reduce the time the join takes. To reduce this time further, we propose a variant algorithm that reduces the network I/O overhead. Along with the MapReduce algorithm, we propose an efficient pre-processing technique that facilitates the vector similarity join computation.
Language
English
URI
https://hdl.handle.net/10371/123152
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
