
Towards an Effective Low-rank Compression of Neural Networks : 심층신경망의 효과적인 저 차원 압축

DC Metadata

dc.contributor.advisor: Wonjong Rhee
dc.contributor.author: 어문정 (Moonjung Eo)
dc.date.accessioned: 2023-11-20T04:41:19Z
dc.date.available: 2023-11-20T04:41:19Z
dc.date.issued: 2023
dc.identifier.other: 000000177526
dc.identifier.uri: https://hdl.handle.net/10371/197056
dc.identifier.uri: https://dcollection.snu.ac.kr/common/orgView/000000177526 (ko_KR)
dc.description: Ph.D. thesis -- Seoul National University, Graduate School of Convergence Science and Technology, Department of Convergence Science (Digital Information Convergence major), August 2023. Advisor: Wonjong Rhee.
dc.description.abstract:
Compression of neural networks has emerged as an essential research topic, especially for edge devices with limited computation power and storage capacity. The most popular compression methods include quantization, pruning of redundant parameters, knowledge distillation from a large network to a small one, and low-rank compression. Low-rank compression has the potential to be a high-performance compression method, but it has not yet reached that potential because existing approaches do not solve the challenge of determining the optimal rank for every layer. This thesis explores two methods that address this challenge and improve compression performance.

First, we propose BSR (Beam-search and Stable Rank), a low-rank compression algorithm that combines an efficient rank-selection method with a compression-friendly training method. For rank selection, BSR employs a modified beam search that jointly optimizes the rank allocation over all layers, in contrast to the heuristic methods used previously. For compression-friendly training, BSR adopts a regularization loss derived from a modified stable rank, which controls the rank while causing almost no harm to performance. Experimental results confirm that BSR is effective and superior to existing low-rank compression methods.

Second, we propose LeSS, a fully joint learning framework that simultaneously determines the filters for filter pruning and the ranks for low-rank decomposition. LeSS does not depend on iterative or heuristic processes, and it satisfies the desired resource budget constraint. It comprises two learning modules: mask learning for filter pruning and threshold learning for low-rank decomposition. The first module learns masks that identify the importance of the filters; the second learns a threshold on the singular values so that only the significant singular values remain. Because both modules are designed to be differentiable, they are easily combined and jointly optimized. By integrating this rank-selection and training scheme with a high-performing existing pruning method, we confirm a significant improvement in performance: LeSS outperforms state-of-the-art methods on a number of benchmarks, demonstrating its effectiveness.

Finally, to obtain high performance in transfer learning on fine-grained datasets, we propose mask learning for both rank and filter selection. The mask-learning approach is well suited to transfer learning because determining which singular values are useful matters more than selecting a rank per se. Our compression approach for transfer learning yields performance that is improved over, or comparable to, the uncompressed results. We anticipate that these techniques will be broadly applicable in industrial domains.
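As a rough illustration of the compression-friendly training described in the abstract, the sketch below attaches a stable-rank penalty to the task loss. It uses the standard stable rank, ||W||_F^2 / sigma_max(W)^2, as a stand-in; the thesis's modified stable rank (mSR) is defined in Chapter 3 and differs in its exact form. The function names, the matricization of convolution kernels, and the weight lam are illustrative assumptions, not the thesis's implementation.

```python
import torch

def stable_rank(w: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Standard stable rank ||W||_F^2 / sigma_max(W)^2 of a 2-D weight.

    Stand-in for the thesis's modified stable rank (mSR); the exact mSR
    definition is given in Chapter 3.
    """
    fro_sq = (w ** 2).sum()
    sigma_max = torch.linalg.matrix_norm(w, ord=2)  # largest singular value
    return fro_sq / (sigma_max ** 2 + eps)

def rank_regularizer(model: torch.nn.Module, lam: float = 1e-3) -> torch.Tensor:
    """Sum of stable ranks over all conv/linear weights, scaled by lam.

    Added to the task loss, this nudges every layer toward a lower
    effective rank before SVD truncation.
    """
    reg = 0.0
    for m in model.modules():
        if isinstance(m, (torch.nn.Linear, torch.nn.Conv2d)):
            reg = reg + stable_rank(m.weight.flatten(1))  # matricize 4-D kernels to 2-D
    return lam * reg
```

During training, the total objective would then be task_loss + rank_regularizer(model); after training, each layer can be truncated via SVD to the rank selected by the beam search.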
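The threshold-learning module of LeSS is specified in Chapter 4; the following is a minimal sketch, under assumptions, of how a threshold on singular values can be made differentiable: a sigmoid gate with temperature tau softly keeps or drops each singular value relative to a learnable threshold t. The class name SoftSVDThreshold, the gate form, and the initialization of t are illustrative, not the thesis's exact formulation.

```python
import torch

class SoftSVDThreshold(torch.nn.Module):
    """Soft-gates the singular values of a weight against a learnable threshold.

    Singular values well above the threshold pass through (gate ~ 1); those
    below it are suppressed (gate ~ 0), so the surviving gate pattern encodes
    the selected rank.
    """

    def __init__(self, weight: torch.Tensor, tau: float = 0.05):
        super().__init__()
        u, s, vh = torch.linalg.svd(weight.detach(), full_matrices=False)
        self.register_buffer("u", u)
        self.register_buffer("vh", vh)
        self.s = torch.nn.Parameter(s.clone())         # singular values (trainable)
        self.t = torch.nn.Parameter(s.mean().clone())  # learnable threshold
        self.tau = tau                                 # gate temperature

    def forward(self) -> torch.Tensor:
        # Differentiable keep/drop decision per singular value.
        gate = torch.sigmoid((self.s - self.t) / self.tau)
        return self.u @ torch.diag(self.s * gate) @ self.vh  # low-rank reconstruction

    @torch.no_grad()
    def selected_rank(self) -> int:
        # Hard rank read-off once training is done.
        return int((self.s > self.t).sum())
```

Penalizing the total gate mass across layers against a resource budget (parameters or FLOPs) would push each threshold upward until the budget constraint is met; because an analogous sigmoid mask can score filters, both selection problems can then be optimized jointly as a single differentiable objective, which is the premise of LeSS.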
dc.description.tableofcontents:
Chapter 1. Introduction 1
1.1 Thesis Outline 4
1.2 Related Publications 4
Chapter 2. Background 6
2.1 Compression of Deep Neural Networks 6
2.2 Structured Compression of Deep Neural Networks 8
2.2.1 Low-Rank Compression 9
2.2.2 Filter Pruning 15
2.3 Low-rank decomposition in other fields 17
2.4 Thesis Roadmap 19
Chapter 3. An Effective Low-Rank Compression with a Joint Rank Selection Followed by a Compression-Friendly Training 20
3.1 Introduction 20
3.2 Contributions 24
3.3 Related works 25
3.3.1 Beam search 25
3.3.2 Stable rank and rank regularization 26
3.4 The basics of low-rank compression 28
3.4.1 The basic process 28
3.4.2 Compression ratio 28
3.5 Methodology 29
3.5.1 Overall process 29
3.5.2 Modified beam-search (mBS) for rank selection 32
3.5.3 Modified stable rank (mSR) for regularized training 35
3.6 Experiments 36
3.6.1 Experimental setting 36
3.6.2 Experimental results 38
3.6.3 Analysis of BSR 47
3.7 Discussion 59
3.7.1 Combined use with quantization 59
3.7.2 Limitations and future works 59
3.8 Conclusion 60
Chapter 4. Learning to Select a Structured Architecture over Filter Pruning and Low-rank Decomposition 61
4.1 Introduction 61
4.2 Contribution 66
4.3 Related works 67
4.3.1 Hybrid compression methods 67
4.4 Background 68
4.4.1 Selection problem for DNN compression 68
4.4.2 Tensor Matricization 68
4.4.3 CNN decomposition scheme 69
4.5 Learning framework for the selection problem in hybrid compression 70
4.6 Experiments 79
4.6.1 Experimental settings 79
4.7 Analysis and discussion 85
4.7.1 Learning strategy analysis 85
4.7.2 Influence of matricization scheme 88
4.7.3 Data efficiency of LeSS 88
4.7.4 Extension to higher-order SVD 90
4.7.5 Extension to transformer architecture 90
4.7.6 Discussion on the reasons for the improved performance of compressed models compared to the uncompressed baseline model 91
4.8 Conclusion 92
Chapter 5. Conclusion and limitations 93
Bibliography 94
Appendices 108
A The SoTA compression methods 109
B Resource budget definition 109
C Implementation details 110
C.1 Hyper-parameter setting 110
C.2 Tuning details of hyper-parameters 111
D Full comparison results 111
dc.format.extent: ix, 113
dc.language.iso: eng
dc.publisher: Graduate School, Seoul National University
dc.subject: Structured Compression
dc.subject: Low-rank Compression
dc.subject: Filter Pruning
dc.subject: Beam Search
dc.subject: Mask Learning
dc.subject.ddc: 004
dc.title: Towards an Effective Low-rank Compression of Neural Networks
dc.title.alternative: 심층신경망의 효과적인 저 차원 압축
dc.type: Thesis
dc.type: Dissertation
dc.contributor.AlternativeAuthor: Moonjung Eo
dc.contributor.department: 융합과학기술대학원 융합과학부(디지털정보융합전공) (Graduate School of Convergence Science and Technology, Department of Convergence Science, Digital Information Convergence major)
dc.description.degree: Ph.D.
dc.date.awarded: 2023-08
dc.identifier.uci: I804:11032-000000177526
dc.identifier.holdings: 000000000050▲000000000058▲000000177526▲