Detailed Information

(A) study on hyperparameter optimization strategy utilizing training time in deep neural networks : 훈련 시간을 활용한 심층 신경망의 하이퍼파라미터 최적화 전략 연구

DC Field: Value
dc.contributor.advisor: Wonjong Rhee
dc.contributor.author: 조형헌
dc.date.accessioned: 2017-07-19T10:58:19Z
dc.date.available: 2017-07-19T10:58:19Z
dc.date.issued: 2017-02
dc.identifier.other: 000000142054
dc.identifier.uri: https://hdl.handle.net/10371/133234
dc.description: Master's thesis -- Seoul National University Graduate School, Department of Transdisciplinary Studies (융합과학부), February 2017. Advisor: Wonjong Rhee.
dc.description.abstract: While the need for feature engineering is greatly reduced in deep neural networks (DNNs) compared to traditional machine learning (ML), hyperparameter optimization (HPO) of DNNs has emerged as an important problem instead.
As DNNs become deeper, the number of hyperparameters and the training time for each hyperparameter vector tend to increase significantly beyond what is typical in traditional ML.
HPO algorithms, which are often considered less efficient than manual HPO performed by experienced experts, therefore become more important for DNNs because of this increased hyperparameter complexity.
This thesis evaluates existing HPO algorithms on DNNs and analyzes hyperparameter interdependencies from the viewpoints of test error and training time.
Spearmint, an existing Bayesian optimization method that updates its prior distribution from the evaluation history, performed well when five or fewer hyperparameters were involved.
Experiments on HPO over seven hyperparameters of LeNet-5, a convolutional neural network (CNN) trained on MNIST, show that the test error distribution along a single hyperparameter tends to be U-shaped, with abrupt changes in test error.
The training time, in contrast, is strongly tied to the number of epochs and the number of neurons in the DNN architecture.
Hence, HPO strategies that utilize the number of epochs and the estimated training time are introduced and investigated in this thesis.
Such a strategy consists of a coarse optimization stage and a fine optimization stage, which train for a small number of epochs and a large number of epochs, respectively.
Using a framework developed to provide traceability, extensibility, and comparability for HPO methods, extended HPO methods are investigated that apply a fine optimization stage after a coarse optimization stage on top of any HPO method.
The extended methods were found to reach better performance faster than the original methods.
This thesis reveals that hyperparameter interdependency affects the variability of test error and training time in a CNN, and that utilizing the training time, which is highly predictable from a hyperparameter vector, sheds light on speeding up HPO for DNNs.
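The coarse-fine strategy summarized in the abstract lends itself to a short illustration. The sketch below is a minimal, hypothetical rendering of the idea in its random-search form: the search space, the surrogate train_and_evaluate objective, and all budgets are invented for illustration and are not the thesis's actual framework or the seven LeNet-5 hyperparameters it studies.

    import math
    import random

    # Hypothetical search space; the real thesis tunes seven LeNet-5 hyperparameters.
    SPACE = {
        "log10_lr": (-4.0, -1.0),   # learning rate, sampled on a log scale
        "n_hidden": (32, 512),      # number of neurons (a main driver of training time)
        "dropout":  (0.0, 0.7),
    }

    def sample_config():
        """Draw one random hyperparameter vector from the (hypothetical) space."""
        return {
            "log10_lr": random.uniform(*SPACE["log10_lr"]),
            "n_hidden": random.randint(*SPACE["n_hidden"]),
            "dropout":  random.uniform(*SPACE["dropout"]),
        }

    def train_and_evaluate(config, epochs):
        """Synthetic stand-in for training the CNN and returning a test error.

        It mimics the U-shaped error curve over a single hyperparameter noted in
        the abstract and improves with more epochs; replace with real training.
        """
        u_shape = (config["log10_lr"] + 2.5) ** 2        # best near 10**-2.5
        noise = random.gauss(0, 0.02)
        return 0.05 + u_shape / (1 + 0.2 * epochs) + noise

    def coarse_fine_random_search(n_coarse=50, n_fine=5,
                                  coarse_epochs=2, fine_epochs=30):
        """Coarse stage: many cheap short-epoch runs.
        Fine stage: re-train only the most promising configurations for many epochs."""
        coarse = [(train_and_evaluate(c, coarse_epochs), c)
                  for c in (sample_config() for _ in range(n_coarse))]
        coarse.sort(key=lambda pair: pair[0])            # lowest test error first
        finalists = [c for _, c in coarse[:n_fine]]
        fine = [(train_and_evaluate(c, fine_epochs), c) for c in finalists]
        return min(fine, key=lambda pair: pair[0])       # (best error, best config)

    if __name__ == "__main__":
        best_error, best_config = coarse_fine_random_search()
        print(best_error, best_config)

Because short-epoch runs are far cheaper, the coarse stage can screen many hyperparameter vectors while the expensive long-epoch training is reserved for a few finalists; the thesis additionally exploits the near-linear relation between training time and the epoch and neuron counts (Section 5.1) to budget these stages.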
dc.description.tableofcontents:
I. Introduction 1
1.1 Background 1
1.1.1 DNN, DL, ML and AI 1
1.1.2 HPO of ML 5
1.2 Research Motivation 7
1.3 Research Objectives 9
II. Related Works 11
2.1 Problem Definition 11
2.2 Manual HPO 13
2.3 Automatic HPO 15
III. Method 23
3.1 Research Questions 23
3.2 Unified HPO Framework 24
3.3 Coarse-Fine Optimization Strategies 27
IV. Experiments 35
4.1 Dataset and Model 35
4.2 Experiment Setup 37
4.3 Experiment Results 40
4.3.1 Existing HPO Algorithms Benchmark 40
4.3.2 Hyperparameter Interdependency 42
4.3.3 Random Coarse-Fine Optimization 57
4.3.4 Bayesian Coarse-Fine Optimization 63
V. Discussions 67
5.1 Linearity between Architecture Hyperparameters and Training Time 67
5.2 Interdependency of Hyperparameters 68
5.3 Reduction of Time to Operation with Coarse-Fine Optimization Algorithm 68
5.4 Limitations 69
VI. Conclusion 71
6.1 Summary 71
6.2 Contributions 72
6.3 Future Work 73
References 75
VII. Appendix. Additional Figures 79
dc.format: application/pdf
dc.format.extent: 10153023 bytes
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: 서울대학교 대학원 (Seoul National University Graduate School)
dc.subject: 하이퍼파라미터 최적화 (hyperparameter optimization)
dc.subject: 훈련 시간 (training time)
dc.subject: 상호 의존성 (interdependency)
dc.subject: 최적화 전략 (optimization strategy)
dc.subject: 심층신경망 (deep neural network)
dc.subject.ddc: 620
dc.title: (A) study on hyperparameter optimization strategy utilizing training time in deep neural networks
dc.title.alternative: 훈련 시간을 활용한 심층 신경망의 하이퍼파라미터 최적화 전략 연구 (A study on hyperparameter optimization strategy utilizing training time in deep neural networks)
dc.type: Thesis
dc.contributor.AlternativeAuthor: CHO HYUNGHUN
dc.description.degree: Master
dc.citation.pages: xii, 94
dc.contributor.affiliation: 융합과학기술대학원 융합과학부 (Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies)
dc.date.awarded: 2017-02