Evenly Angle Dispersing Methods for Convolutional Kernel Regularization

배정우

Evenly Angle Dispersing Methods for Convolutional Kernel Regularization : 합성곱 커널 정규화를 위한 고른 각도분산방법

Authors: 배정우

Advisor: 강명주

Issue Date: 2022

Publisher: Seoul National University Graduate School (서울대학교 대학원)

Keywords: Deep Learning ; Convolution ; Kernel ; Regularization ; Orthogonality ; Evenly Dispersed State

Description: Thesis (Ph.D.) -- Seoul National University Graduate School : College of Natural Sciences, Department of Mathematical Sciences, 2022. 8. 강명주.

Abstract: In this thesis, we propose new convolutional kernel regularization methods. Along with the development of deep learning, there have been attempts to effectively regularize the convolutional layer, an important basic module of deep neural networks. Convolutional neural networks (CNNs) are excellent at abstracting input data, but as they become deeper they suffer from vanishing or exploding gradients and produce redundant features. One approach to these issues is to directly regularize the convolutional kernel weights of a CNN. The basic idea is to convert a convolutional kernel weight into a matrix and make the row or column vectors of the matrix orthogonal. However, this approach has some shortcomings. First, it requires appropriate handling because an overcomplete issue occurs when the number of vectors is larger than their dimension, in which case the vectors cannot all be mutually orthogonal. To deal with this issue, we define the concept of an evenly dispersed state and propose the PH0 and MST regularizations based on it. Second, prior regularizations that force the Gram matrix of a matrix toward the identity matrix might not be an optimal approach to orthogonality of the matrix. We point out that these regularizations actually reduce the update of the angle between two vectors when the two vectors are close to each other. Therefore, to address this issue, we propose the EADK and EADC regularizations, which update the angle directly. Through various experiments, we demonstrate that the EADK and EADC regularizations outperform prior methods in some neural network architectures and, in particular, that EADK has a fast learning time.
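Abstract (translated from the Korean): This thesis proposes new regularization methods for convolutional kernels. With the development of deep learning, there have been attempts to effectively regularize the convolutional layer, the most basic module of a neural network. Convolutional neural networks excel at abstracting input data, but as the network grows deeper they cause vanishing or exploding gradients and produce redundant features. One approach to solving these problems is to directly regularize the convolutional kernels of the network. This method converts a convolutional kernel into a matrix and makes the row or column vectors of the matrix orthogonal. However, this approach has several drawbacks. First, when the number of vectors exceeds the dimension of the vectors, not all vectors can be made orthogonal, so appropriate techniques are required. As one way to handle this problem, we define the concept of a dispersed state and propose the PH0 and MST regularizations that use it. Second, existing regularizations that drive the Gram matrix toward the identity matrix may not be the optimal way to orthogonalize the vectors. That is, when two vectors are close, existing regularizations actually reduce the update of the angle. To compensate for this, we propose the EADK and EADC regularizations, which update the angle directly. Through various experiments, we confirm that the EADK and EADC regularizations outperform existing methods in many neural network architectures and that EADK in particular has a fast training time.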
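
The baseline the abstract criticizes can be illustrated with a minimal sketch. It is not the thesis's code and does not implement PH0, MST, EADK, or EADC, whose definitions are not given on this page. It assumes the common "soft orthogonality" form of a Gram-matrix penalty, ||W W^T - I||_F^2, applied to a kernel flattened into one row per output filter. All function names are hypothetical. The sketch shows two things: the penalty cannot reach zero when there are more vectors than dimensions, and for two unit vectors at angle theta the penalty is 2 cos^2(theta), whose derivative with respect to theta, -2 sin(2 theta), shrinks toward zero as the vectors get close.

```python
import math


def flatten_kernel(kernel):
    """Reshape a kernel given as nested lists [out][in][kh][kw]
    into a matrix with one row of length in*kh*kw per output filter."""
    return [[w for in_ch in filt for row in in_ch for w in row]
            for filt in kernel]


def gram(rows):
    """Gram matrix of the row vectors: G[i][j] = <rows[i], rows[j]>."""
    return [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]


def soft_orthogonality_penalty(rows):
    """Assumed baseline penalty: squared Frobenius norm of (G - I)."""
    g = gram(rows)
    n = len(g)
    return sum((g[i][j] - (1.0 if i == j else 0.0)) ** 2
               for i in range(n) for j in range(n))


def pair_penalty(theta):
    """Penalty for two unit vectors at angle theta: 2 * cos(theta)^2."""
    u = [1.0, 0.0]
    v = [math.cos(theta), math.sin(theta)]
    return soft_orthogonality_penalty([u, v])


def pair_penalty_angle_grad(theta):
    """Derivative of pair_penalty with respect to theta: -2 * sin(2 * theta)."""
    return -2.0 * math.sin(2.0 * theta)


if __name__ == "__main__":
    # Two filters, one input channel, 1x2 kernels: already orthonormal rows.
    kernel = [[[[1.0, 0.0]]], [[[0.0, 1.0]]]]
    print(soft_orthogonality_penalty(flatten_kernel(kernel)))

    # Overcomplete case: three unit vectors in two dimensions, 120 degrees
    # apart. The penalty stays positive because they cannot all be orthogonal.
    three = [[math.cos(k * 2 * math.pi / 3), math.sin(k * 2 * math.pi / 3)]
             for k in range(3)]
    print(soft_orthogonality_penalty(three))

    # The angle gradient of the Gram penalty fades as the vectors get closer.
    for theta in (math.pi / 4, 0.5, 0.1, 0.01):
        print(theta, pair_penalty_angle_grad(theta))
```

Under this assumed penalty, nearly parallel vectors receive almost no push apart, which matches the observation in the abstract that motivates regularizing the angle directly.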

Language: eng

URI: https://hdl.handle.net/10371/188575

https://dcollection.snu.ac.kr/common/orgView/000000172854

Files in This Item:

000000172854.pdf 5.32 MB

Appears in Collections:

College of Natural Sciences (자연과학대학)
- Dept. of Mathematical Sciences (수리과학부)
  - Theses (Ph.D. / Sc.D._수리과학부)
