Publications

Detailed Information

RBForest: Efficient In-Memory Format for Sparse Matrices on Dynamic Workloads : RBForest: 동적 워크로드 상의 희소행렬을 위한 효율적인 인-메모리 포맷

DC Field Value Language
dc.contributor.advisor문봉기-
dc.contributor.author김원현-
dc.date.accessioned2023-06-29T02:00:10Z-
dc.date.available2023-06-29T02:00:10Z-
dc.date.issued2023-
dc.identifier.other000000174396-
dc.identifier.urihttps://hdl.handle.net/10371/193351-
dc.identifier.urihttps://dcollection.snu.ac.kr/common/orgView/000000174396ko_KR
dc.description학위논문(석사) -- 서울대학교대학원 : 공과대학 컴퓨터공학부, 2023. 2. 문봉기.-
dc.description.abstractSparse matrices only store nonzero entries with their positional information. Thus, while dense matrices typically store all of their entries in a one-dimensional array, sparse matrices have various choices of internal data structures to use. So there can be many different in-memory formats where sparse matrices are managed.
Various sparse matrix formats have been proposed, and they have advantages in different ways. Formats that support fast linear algebra operations cannot handle matrix updates quickly and vice versa. However, several use cases of sparse matrices are highly dynamic. So the sparse matrices are expected to periodically process analytic queries, including linear algebra operations, while quickly handling a stream of matrix updates.
Thus, one of the mainstream strategies is to manage a sparse matrix in an update-friendly format during matrix updates and convert it to a format for fast linear algebra operations when we need to process analytic queries. To make the best use of this strategy, this paper proposes a new sparse matrix format called RBForest. The RBForest manages one red-black tree per row to reduce the tree re-balancing cost after the insertion or deletion of nonzero entries. It also manages a hash table to enable immediate access to its nonzero elements, reducing the initial search cost for all types of matrix updates. Our evaluations show that the RBForest performs more efficiently on real dynamic workloads than previously proposed formats that adhere to the same strategy.
-
dc.description.tableofcontents1 Introduction 1
2 Backgrounds 5
2.1 Matrix Update 5
2.2 Categories of Sparse Matrix Formats 6
2.2.1 Formats for Fast LA Operations 6
2.2.2 Formats for Fast Updates 7
3 Proposed Sparse Matrix Formats 10
3.1 Dictionary of Keys (DOK) 12
3.2 List of Lists (LIL) 12
3.3 Red-Black Tree (RBT) 13
4 RBForest 15
4.1 Design Choices 16
4.1.1 Individual Red-Black Trees for each row 16
4.1.2 Hash Table Connected to Red-Black Trees 17
4.2 Operations 17
4.2.1 Search 18
4.2.2 Insertion 18
4.2.3 Modification 19
4.2.4 Deletion 19
4.2.5 CSR Conversion 20
5 Evaluation 22
5.1 Performance on Individual Operations 22
5.1.1 Insertion 24
5.1.2 Modification 25
5.1.3 CSR Conversion 25
5.1.4 Deletion 25
5.2 Performance on Dynamic Workload 26
5.2.1 Test with Synthetic Matrix 26
5.2.2 Test with Real-World Matrices 28
5.3 Memory Consumption 30
5.4 Additional Remarks on Experimental Results 31
6 Conclusion and Future Works 34
국문초록 39
-
dc.format.extentv, 39-
dc.language.isoeng-
dc.publisher서울대학교 대학원-
dc.subjectSparse Matrix-
dc.subjectMatrix Update-
dc.subjectCSR Format-
dc.subject.ddc621.39-
dc.titleRBForest: Efficient In-Memory Format for Sparse Matrices on Dynamic Workloads-
dc.title.alternativeRBForest: 동적 워크로드 상의 희소행렬을 위한 효율적인 인-메모리 포맷-
dc.typeThesis-
dc.typeDissertation-
dc.contributor.AlternativeAuthorWonhyeon Kim-
dc.contributor.department공과대학 컴퓨터공학부-
dc.description.degree석사-
dc.date.awarded2023-02-
dc.identifier.uciI804:11032-000000174396-
dc.identifier.holdings000000000049▲000000000056▲000000174396▲-
Appears in Collections:
Files in This Item:

Altmetrics

Item View & Download Count

  • mendeley

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.

Share