Detailed Information

(A) study on hyperparameter optimization strategy utilizing training time in deep neural networks : 훈련 시간을 활용한 심층 신경망의 하이퍼파라미터 최적화 전략 연구

DC Field: Value
dc.contributor.advisor: Wonjong Rhee
dc.contributor.author: 조형헌
dc.date.accessioned: 2017-07-19T10:58:19Z
dc.date.available: 2017-07-19T10:58:19Z
dc.date.issued: 2017-02
dc.identifier.other: 000000142054
dc.identifier.uri: https://hdl.handle.net/10371/133234
dc.description: Master's thesis -- Seoul National University Graduate School, Department of Transdisciplinary Studies (융합과학부), February 2017. Advisor: Wonjong Rhee.
dc.description.abstract: While the need for feature engineering is greatly reduced in deep neural networks (DNNs) compared to traditional machine learning (ML), hyperparameter optimization (HPO) of DNNs has emerged as an important problem instead.
As DNNs become deeper, the number of hyperparameters and the training time for each hyperparameter vector tend to increase significantly beyond what is typical in traditional ML.
HPO algorithms, which are often considered less efficient than manual HPO performed by experienced experts, therefore become more important for DNNs because of this increased hyperparameter complexity.
This thesis evaluates existing HPO algorithms on DNNs and analyzes hyperparameter interdependencies from the viewpoints of test error and training time.
Spearmint, an existing Bayesian optimization method that updates its prior distribution from the evaluation history, performed well when five or fewer hyperparameters were involved.
Experiments on HPO over seven hyperparameters of LeNet-5, a convolutional neural network (CNN) trained on MNIST, show that the test error distribution along a single hyperparameter tends to be U-shaped, with abrupt changes in test error.
The training time, in contrast, is strongly tied to the number of epochs and the number of neurons in the DNN architecture.
Hence, HPO strategies that utilize the number of epochs and the estimated training time are introduced and investigated in this thesis.
Such a strategy consists of a coarse optimization stage and a fine optimization stage, which train for a small number of epochs and a large number of epochs, respectively.
Using a framework developed to provide traceability, extensibility, and comparability for HPO methods, extended HPO methods are investigated that apply a fine optimization stage after a coarse optimization stage on top of any HPO method.
The extended methods were found to reach better performance faster than the original methods.
This thesis reveals that hyperparameter interdependency affects the variability of test error and training time in a CNN, and that utilizing the training time, which is highly predictable from a hyperparameter vector, sheds light on speeding up HPO for DNNs.
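The coarse-fine strategy summarized in the abstract lends itself to a short illustration. The sketch below is a minimal, hypothetical rendering of the idea in its random-search form: the search space, the surrogate train_and_evaluate objective, and all budgets are invented for illustration and are not the thesis's actual framework or the seven LeNet-5 hyperparameters it studies.

    import math
    import random

    # Hypothetical search space; the real thesis tunes seven LeNet-5 hyperparameters.
    SPACE = {
        "log10_lr": (-4.0, -1.0),   # learning rate, sampled on a log scale
        "n_hidden": (32, 512),      # number of neurons (a main driver of training time)
        "dropout":  (0.0, 0.7),
    }

    def sample_config():
        """Draw one random hyperparameter vector from the (hypothetical) space."""
        return {
            "log10_lr": random.uniform(*SPACE["log10_lr"]),
            "n_hidden": random.randint(*SPACE["n_hidden"]),
            "dropout":  random.uniform(*SPACE["dropout"]),
        }

    def train_and_evaluate(config, epochs):
        """Synthetic stand-in for training the CNN and returning a test error.

        It mimics the U-shaped error curve over a single hyperparameter noted in
        the abstract and improves with more epochs; replace with real training.
        """
        u_shape = (config["log10_lr"] + 2.5) ** 2        # best near 10**-2.5
        noise = random.gauss(0, 0.02)
        return 0.05 + u_shape / (1 + 0.2 * epochs) + noise

    def coarse_fine_random_search(n_coarse=50, n_fine=5,
                                  coarse_epochs=2, fine_epochs=30):
        """Coarse stage: many cheap short-epoch runs.
        Fine stage: re-train only the most promising configurations for many epochs."""
        coarse = [(train_and_evaluate(c, coarse_epochs), c)
                  for c in (sample_config() for _ in range(n_coarse))]
        coarse.sort(key=lambda pair: pair[0])            # lowest test error first
        finalists = [c for _, c in coarse[:n_fine]]
        fine = [(train_and_evaluate(c, fine_epochs), c) for c in finalists]
        return min(fine, key=lambda pair: pair[0])       # (best error, best config)

    if __name__ == "__main__":
        best_error, best_config = coarse_fine_random_search()
        print(best_error, best_config)

Because short-epoch runs are far cheaper, the coarse stage can screen many hyperparameter vectors while the expensive long-epoch training is reserved for a few finalists; the thesis additionally exploits the near-linear relation between training time and the epoch and neuron counts (Section 5.1) to budget these stages.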
dc.description.tableofcontents:
I. Introduction 1
1.1 Background 1
1.1.1 DNN, DL, ML and AI 1
1.1.2 HPO of ML 5
1.2 Research Motivation 7
1.3 Research Objectives 9
II. Related Works 11
2.1 Problem Definition 11
2.2 Manual HPO 13
2.3 Automatic HPO 15
III. Method 23
3.1 Research Questions 23
3.2 Unified HPO Framework 24
3.3 Coarse-Fine Optimization Strategies 27
IV. Experiments 35
4.1 Dataset and Model 35
4.2 Experiment Setup 37
4.3 Experiment Results 40
4.3.1 Existing HPO Algorithms Benchmark 40
4.3.2 Hyperparameter Interdependency 42
4.3.3 Random Coarse-Fine Optimization 57
4.3.4 Bayesian Coarse-Fine Optimization 63
V. Discussions 67
5.1 Linearity between Architecture Hyperparameters and Training Time 67
5.2 Interdependency of Hyperparameters 68
5.3 Reduction of Time to Operation with Coarse-Fine Optimization Algorithm 68
5.4 Limitations 69
VI. Conclusion 71
6.1 Summary 71
6.2 Contributions 72
6.3 Future Work 73
References 75
VII. Appendix. Additional Figures 79
dc.format: application/pdf
dc.format.extent: 10153023 bytes
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: 서울대학교 대학원 (Seoul National University Graduate School)
dc.subject: 하이퍼파라미터 최적화 (hyperparameter optimization)
dc.subject: 훈련 시간 (training time)
dc.subject: 상호 의존성 (interdependency)
dc.subject: 최적화 전략 (optimization strategy)
dc.subject: 심층신경망 (deep neural network)
dc.subject.ddc: 620
dc.title: (A) study on hyperparameter optimization strategy utilizing training time in deep neural networks
dc.title.alternative: 훈련 시간을 활용한 심층 신경망의 하이퍼파라미터 최적화 전략 연구 (A study on hyperparameter optimization strategy utilizing training time in deep neural networks)
dc.type: Thesis
dc.contributor.AlternativeAuthor: CHO HYUNGHUN
dc.description.degree: Master
dc.citation.pages: xii, 94
dc.contributor.affiliation: 융합과학기술대학원 융합과학부 (Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies)
dc.date.awarded: 2017-02