
Towards an Effective Low-rank Compression of Neural Networks : 심층신경망의 효과적인 저 차원 압축

DC Metadata

dc.contributor.advisor: Wonjong Rhee
dc.contributor.author: 어문정 (Moonjung Eo)
dc.date.accessioned: 2023-11-20T04:41:19Z
dc.date.available: 2023-11-20T04:41:19Z
dc.date.issued: 2023
dc.identifier.other: 000000177526
dc.identifier.uri: https://hdl.handle.net/10371/197056
dc.identifier.uri: https://dcollection.snu.ac.kr/common/orgView/000000177526 (ko_KR)
dc.description: Ph.D. thesis -- Seoul National University, Graduate School of Convergence Science and Technology, Department of Convergence Science (Digital Information Convergence major), August 2023. Advisor: Wonjong Rhee.
dc.description.abstract:
Compression of neural networks has emerged as an essential research topic, especially for edge devices with limited computation power and storage capacity. The most popular compression methods include quantization, pruning of redundant parameters, knowledge distillation from a large network to a small one, and low-rank compression. Low-rank compression has the potential to be a high-performance compression method, but it has not yet reached that potential because existing approaches do not solve the challenge of determining the optimal rank for every layer. This thesis explores two methods that address this challenge and improve compression performance.

First, we propose BSR (Beam-search and Stable Rank), a low-rank compression algorithm that combines an efficient rank-selection method with a compression-friendly training method. For rank selection, BSR employs a modified beam search that jointly optimizes the rank allocation over all layers, in contrast to the heuristic methods used previously. For compression-friendly training, BSR adopts a regularization loss derived from a modified stable rank, which controls the rank while causing almost no harm to performance. Experimental results confirm that BSR is effective and superior to existing low-rank compression methods.

Second, we propose LeSS, a fully joint learning framework that simultaneously determines the filters for filter pruning and the ranks for low-rank decomposition. LeSS does not depend on iterative or heuristic processes, and it satisfies the desired resource budget constraint. It comprises two learning modules: mask learning for filter pruning and threshold learning for low-rank decomposition. The first module learns masks that identify the importance of the filters; the second learns a threshold on the singular values so that only the significant singular values remain. Because both modules are designed to be differentiable, they are easily combined and jointly optimized. By integrating this rank-selection and training scheme with a high-performing existing pruning method, we confirm a significant improvement in performance: LeSS outperforms state-of-the-art methods on a number of benchmarks, demonstrating its effectiveness.

Finally, to obtain high performance in transfer learning on fine-grained datasets, we propose mask learning for both rank and filter selection. The mask-learning approach is well suited to transfer learning because determining which singular values are useful matters more than selecting a rank per se. Our compression approach for transfer learning yields performance that is improved over, or comparable to, the uncompressed results. We anticipate that these techniques will be broadly applicable in industrial domains.
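As a rough illustration of the compression-friendly training described in the abstract, the sketch below attaches a stable-rank penalty to the task loss. It uses the standard stable rank, ||W||_F^2 / sigma_max(W)^2, as a stand-in; the thesis's modified stable rank (mSR) is defined in Chapter 3 and differs in its exact form. The function names, the matricization of convolution kernels, and the weight lam are illustrative assumptions, not the thesis's implementation.

```python
import torch

def stable_rank(w: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Standard stable rank ||W||_F^2 / sigma_max(W)^2 of a 2-D weight.

    Stand-in for the thesis's modified stable rank (mSR); the exact mSR
    definition is given in Chapter 3.
    """
    fro_sq = (w ** 2).sum()
    sigma_max = torch.linalg.matrix_norm(w, ord=2)  # largest singular value
    return fro_sq / (sigma_max ** 2 + eps)

def rank_regularizer(model: torch.nn.Module, lam: float = 1e-3) -> torch.Tensor:
    """Sum of stable ranks over all conv/linear weights, scaled by lam.

    Added to the task loss, this nudges every layer toward a lower
    effective rank before SVD truncation.
    """
    reg = 0.0
    for m in model.modules():
        if isinstance(m, (torch.nn.Linear, torch.nn.Conv2d)):
            reg = reg + stable_rank(m.weight.flatten(1))  # matricize 4-D kernels to 2-D
    return lam * reg
```

During training, the total objective would then be task_loss + rank_regularizer(model); after training, each layer can be truncated via SVD to the rank selected by the beam search.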
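The threshold-learning module of LeSS is specified in Chapter 4; the following is a minimal sketch, under assumptions, of how a threshold on singular values can be made differentiable: a sigmoid gate with temperature tau softly keeps or drops each singular value relative to a learnable threshold t. The class name SoftSVDThreshold, the gate form, and the initialization of t are illustrative, not the thesis's exact formulation.

```python
import torch

class SoftSVDThreshold(torch.nn.Module):
    """Soft-gates the singular values of a weight against a learnable threshold.

    Singular values well above the threshold pass through (gate ~ 1); those
    below it are suppressed (gate ~ 0), so the surviving gate pattern encodes
    the selected rank.
    """

    def __init__(self, weight: torch.Tensor, tau: float = 0.05):
        super().__init__()
        u, s, vh = torch.linalg.svd(weight.detach(), full_matrices=False)
        self.register_buffer("u", u)
        self.register_buffer("vh", vh)
        self.s = torch.nn.Parameter(s.clone())         # singular values (trainable)
        self.t = torch.nn.Parameter(s.mean().clone())  # learnable threshold
        self.tau = tau                                 # gate temperature

    def forward(self) -> torch.Tensor:
        # Differentiable keep/drop decision per singular value.
        gate = torch.sigmoid((self.s - self.t) / self.tau)
        return self.u @ torch.diag(self.s * gate) @ self.vh  # low-rank reconstruction

    @torch.no_grad()
    def selected_rank(self) -> int:
        # Hard rank read-off once training is done.
        return int((self.s > self.t).sum())
```

Penalizing the total gate mass across layers against a resource budget (parameters or FLOPs) would push each threshold upward until the budget constraint is met; because an analogous sigmoid mask can score filters, both selection problems can then be optimized jointly as a single differentiable objective, which is the premise of LeSS.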
dc.description.tableofcontents:
Chapter 1. Introduction 1
1.1 Thesis Outline 4
1.2 Related Publications 4
Chapter 2. Background 6
2.1 Compression of Deep Neural Networks 6
2.2 Structured Compression of Deep Neural Networks 8
2.2.1 Low-Rank Compression 9
2.2.2 Filter Pruning 15
2.3 Low-rank decomposition in other fields 17
2.4 Thesis Roadmap 19
Chapter 3. An Effective Low-Rank Compression with a Joint Rank Selection Followed by a Compression-Friendly Training 20
3.1 Introduction 20
3.2 Contributions 24
3.3 Related works 25
3.3.1 Beam search 25
3.3.2 Stable rank and rank regularization 26
3.4 The basics of low-rank compression 28
3.4.1 The basic process 28
3.4.2 Compression ratio 28
3.5 Methodology 29
3.5.1 Overall process 29
3.5.2 Modified beam-search (mBS) for rank selection 32
3.5.3 Modified stable rank (mSR) for regularized training 35
3.6 Experiments 36
3.6.1 Experimental setting 36
3.6.2 Experimental results 38
3.6.3 Analysis of BSR 47
3.7 Discussion 59
3.7.1 Combined use with quantization 59
3.7.2 Limitations and future works 59
3.8 Conclusion 60
Chapter 4. Learning to Select a Structured Architecture over Filter Pruning and Low-rank Decomposition 61
4.1 Introduction 61
4.2 Contribution 66
4.3 Related works 67
4.3.1 Hybrid compression methods 67
4.4 Background 68
4.4.1 Selection problem for DNN compression 68
4.4.2 Tensor Matricization 68
4.4.3 CNN decomposition scheme 69
4.5 Learning framework for the selection problem in hybrid compression 70
4.6 Experiments 79
4.6.1 Experimental settings 79
4.7 Analysis and discussion 85
4.7.1 Learning strategy analysis 85
4.7.2 Influence of matricization scheme 88
4.7.3 Data efficiency of LeSS 88
4.7.4 Extension to higher-order SVD 90
4.7.5 Extension to transformer architecture 90
4.7.6 Discussion on the reasons for the improved performance of compressed models compared to the uncompressed baseline model 91
4.8 Conclusion 92
Chapter 5. Conclusion and limitations 93
Bibliography 94
Appendices 108
A The SoTA compression methods 109
B Resource budget definition 109
C Implementation details 110
C.1 Hyper-parameter setting 110
C.2 Tuning details of hyper-parameters 111
D Full comparison results 111
dc.format.extent: ix, 113
dc.language.iso: eng
dc.publisher: Graduate School, Seoul National University
dc.subject: Structured Compression
dc.subject: Low-rank Compression
dc.subject: Filter Pruning
dc.subject: Beam Search
dc.subject: Mask Learning
dc.subject.ddc: 004
dc.title: Towards an Effective Low-rank Compression of Neural Networks
dc.title.alternative: 심층신경망의 효과적인 저 차원 압축
dc.type: Thesis
dc.type: Dissertation
dc.contributor.AlternativeAuthor: Moonjung Eo
dc.contributor.department: 융합과학기술대학원 융합과학부(디지털정보융합전공) (Graduate School of Convergence Science and Technology, Department of Convergence Science, Digital Information Convergence major)
dc.description.degree: Ph.D.
dc.date.awarded: 2023-08
dc.identifier.uci: I804:11032-000000177526
dc.identifier.holdings: 000000000050▲000000000058▲000000177526▲