Detailed Information

Mining Real World Tensors via Efficient Tensor Decomposition Methods: 효율적인 텐서 분해 방법을 통한 실세계 텐서 마이닝

DC Field: Value (Language)

dc.contributor.advisor: 강유
dc.contributor.author: 장준기
dc.date.accessioned: 2023-06-29T01:59:46Z
dc.date.available: 2023-06-29T01:59:46Z
dc.date.issued: 2023
dc.identifier.other: 000000175069
dc.identifier.uri: https://hdl.handle.net/10371/193341
dc.identifier.uri: https://dcollection.snu.ac.kr/common/orgView/000000175069 (ko_KR)
dc.description: Dissertation (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2023. 2. 강유.
dc.description.abstract: Many real-world data can be represented as tensors, including vectors (first-order tensors), matrices (second-order tensors), and higher-order tensors. For example, stock data, healthcare data, video data, sensor data, and movie rating data can all be represented as tensors. Tensor decomposition has been widely used in applications that analyze real-world tensors. Since valuable knowledge is inherent in real-world tensors, it is crucial to devise effective tensor decomposition methods. However, existing tensor decomposition-based methods incur high computational and space costs, so it is very challenging to discover hidden information in large-scale tensors without efficient tensor decomposition methods.
In this thesis, I overcome the limitations of previous tensor analysis methods based on tensor decomposition. Since existing tensor decomposition methods require heavy computations involving large-scale input tensors, avoiding those computations is crucial for achieving high efficiency. My proposed methods achieve high efficiency by approximating an input tensor and exploiting the approximation result. I devise highly efficient methods for regular and irregular tensors by exploiting the characteristics of real-world tensors and carefully determining the order of computations. Furthermore, I develop a fast and memory-efficient tensor decomposition-based method that analyzes diverse time ranges. Extensive experiments show that the proposed methods achieve higher efficiency than existing methods while having comparable errors: they decompose regular and irregular tensors up to 38.4x and 6x faster than existing methods, respectively, and answer time range queries up to 171.9x faster. Consequently, the proposed methods allow us to efficiently explore meaningful knowledge in various real-world tensors.
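As a rough illustration of the kind of decomposition the abstract discusses, the following is a minimal truncated-HOSVD sketch in NumPy for a third-order regular tensor. This is a generic textbook construction, not the D-Tucker or PARAFAC2-based methods proposed in the thesis; the function names and rank choices are illustrative only.

```python
# Minimal truncated-HOSVD (Tucker-style) sketch in NumPy.
# Each factor matrix comes from an SVD of a mode-wise unfolding;
# the core is the tensor contracted with the transposed factors.
import numpy as np

def unfold(X, mode):
    """Mode-n matricization of a 3rd-order tensor."""
    return np.moveaxis(X, mode, 0).reshape(X.shape[mode], -1)

def hosvd(X, ranks):
    """Truncated HOSVD: per-mode factor matrices, then the core tensor."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(X, mode), full_matrices=False)
        factors.append(U[:, :r])            # leading r left singular vectors
    core = X
    for mode, U in enumerate(factors):      # core = X x1 U1^T x2 U2^T x3 U3^T
        core = np.moveaxis(np.tensordot(U.T, core, axes=(1, mode)), 0, mode)
    return core, factors

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 9, 10))
core, factors = hosvd(X, (3, 4, 5))
print(core.shape)                           # (3, 4, 5)
```

The thesis's efficiency gains come from replacing the expensive full-tensor computations above with operations on a compact SVD-based approximation of the input; this sketch only shows the baseline being accelerated.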
dc.description.abstract (Korean, translated): Various multi-dimensional real-world data are represented as tensors. A tensor is a concept encompassing first-order vectors, second-order matrices, and higher-order tensors of three or more orders. For example, stock data, healthcare data, video data, sensor data, and movie rating data can be represented as tensors. Since important knowledge and information are inherent in real-world tensors, it is very important to develop tools for analyzing them. However, traditional tensor analysis methods require a great deal of time and space to analyze most large-scale real-world tensors, which has made it difficult to explore the important knowledge and information they contain.
This thesis aims to overcome the limitations of tensor decomposition-based analysis of large-scale real-world tensors. Existing tensor decomposition methods involve heavy computation with the given large-scale input tensor, so reducing that computation is crucial for improving efficiency. Accordingly, I propose fast and scalable techniques that minimize accuracy loss by exploiting the structural characteristics of real-world tensor data to reduce the amount of computation on the large-scale input tensor. In addition, I propose a tensor decomposition-based analysis technique that efficiently explores knowledge hidden in diverse time ranges. Experiments show that the proposed methods analyze real-world tensors very efficiently, achieving very high efficiency with comparable errors. Compared with existing methods, the proposed methods decompose a given regular tensor and irregular tensor up to 38.4 and 6 times faster, respectively, and enable analysis of arbitrary time ranges up to 171.9 times faster. The proposed methods make it possible to efficiently explore meaningful knowledge from various real-world tensors.
dc.description.tableofcontents: Chapter 1 Introduction 1
1.1 Contributions 4
1.2 Overall Impact 5
1.3 Thesis Organization 6

Chapter 2 Background 7
2.1 Tensor 7
2.1.1 Tensor Notation 7
2.1.2 Tensor Operation 7
2.2 Tensor Decomposition 9
2.2.1 Tucker Decomposition 9
2.2.2 PARAFAC2 Decomposition 11
2.3 Related Works 14
2.3.1 Tensor Decomposition on Regular Tensors 15
2.3.2 PARAFAC2 Decomposition on Irregular Tensors 16
2.3.3 Online Streaming Tensor Decomposition 17
2.3.4 Answering Time Range Queries on Regular Tensors 18

Chapter 3 Efficient Static and Streaming Tensor Decomposition in Regular Tensors 19
3.1 Motivation 19
3.2 Preliminaries 22
3.2.1 Singular Value Decomposition 23
3.2.2 Streaming Tucker Decomposition 24
3.3 Proposed Method for Static Tensors: D-Tucker 25
3.3.1 Overview 26
3.3.2 Approximation Phase 28
3.3.3 Initialization Phase 31
3.3.4 Iteration Phase 37
3.3.5 Lemmas and Theorems 40
3.3.6 Proofs of Lemmas and Theorems 42
3.4 Proposed Method for Online Tensors: D-TuckerO 44
3.4.1 Overview 44
3.4.2 Efficient Update for Time Slice 45
3.4.3 Applying Approximation Phase 50
3.4.4 Theoretical Analysis 53
3.4.5 Proofs of Lemmas and Theorems 53
3.5 Experiment 56
3.5.1 Experimental Settings 57
3.5.2 Time Cost and Reconstruction Error 62
3.5.3 Effectiveness of the Initialization Phase 62
3.5.4 Efficiency of the Iteration Phase 62
3.5.5 Space Cost 64
3.5.6 Scalability 64
3.5.7 Streaming Setting 66
3.5.8 Size of Time Slice 69
3.6 Summary 69

Chapter 4 Efficient Tensor Decomposition in Irregular Tensors 71
4.1 Motivation 71
4.2 Preliminaries 74
4.2.1 Singular Value Decomposition 74
4.3 Proposed Method 76
4.3.1 Overview 76
4.3.2 Compressing an Irregular Tensor 77
4.3.3 Overview of the Update Rule 81
4.3.4 Finding the Factorized Matrices of Q_k and Y_k 81
4.3.5 Updating H, V, and W 83
4.3.6 Careful Distribution of Work 89
4.3.7 Complexities 90
4.4 Experiments 92
4.4.1 Experimental Settings 93
4.4.2 Performance 95
4.4.3 Data Scalability 97
4.4.4 Multi-core Scalability 99
4.4.5 Discoveries 99
4.5 Summary 103

Chapter 5 Efficient Tensor Decomposition for Diverse Time Ranges in Regular Tensors 105
5.1 Motivation 105
5.2 Problem Definition 109
5.3 Proposed Method 110
5.3.1 Preprocessing Phase 111
5.3.2 Query Phase 114
5.3.3 Analysis 121
5.3.4 Proofs of Lemmas and Theorems 123
5.4 Experiment 127
5.4.1 Experimental Settings 128
5.4.2 Trade-off between Query Time and Reconstruction Error 130
5.4.3 Space Cost 131
5.4.4 Query Cost 131
5.4.5 Effects of Block Size b 133
5.4.6 Discovery 135
5.5 Summary 136

Chapter 6 Future Works 138
6.1 Efficient Online Streaming Method for an Irregular Tensor 138
6.2 Novel Tensor Method with Deep Learning Techniques 139

Chapter 7 Conclusion 140

References 142

Abstract in Korean 157
dc.format.extent: ix, 158
dc.language.iso: eng
dc.publisher: 서울대학교 대학원
dc.subject: Tensor Mining
dc.subject: Efficiency
dc.subject: Tensor Decomposition
dc.subject: Tucker Decomposition
dc.subject: PARAFAC2 Decomposition
dc.subject: Real-world Regular Tensors
dc.subject: Real-world Irregular Tensors
dc.subject.ddc: 621.39
dc.title: Mining Real World Tensors via Efficient Tensor Decomposition Methods
dc.title.alternative: 효율적인 텐서 분해 방법을 통한 실세계 텐서 마이닝
dc.type: Thesis
dc.type: Dissertation
dc.contributor.AlternativeAuthor: Jun-Gi Jang
dc.contributor.department: 공과대학 컴퓨터공학부
dc.description.degree: Ph.D.
dc.date.awarded: 2023-02
dc.contributor.major: 컴퓨터공학전공
dc.identifier.uci: I804:11032-000000175069
dc.identifier.holdings: 000000000049▲000000000056▲000000175069▲