Bayesian Deep Meta-Learning via Variational Inference: Applications in Few-Shot and Federated Learning

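Abstract: Meta-learning is a subfield of machine learning that aims to develop algorithms that adapt quickly to new tasks. This rapid adaptation can be achieved with a meta-learner that learns the learning process itself instead of focusing on learning individual tasks. The meta-learner can then be used to adapt a machine to new tasks efficiently. Recently, a variety of meta-learning strategies, including metric-based, optimization-based, and model-based methods, have been introduced and applied to areas such as few-shot regression, few-shot classification, active learning, and reinforcement learning. However, current meta-learning approaches, and the meta-learner in particular, still have limitations in computational demands, scalability, and model overfitting.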

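This study proposes a new Bayesian meta-learning strategy called Meta-Variational Dropout (MetaVD). MetaVD uses a hypernetwork to estimate task-specific dropout rates for each neural network weight. This enables data-efficient learning in multi-task environments and fast reconfiguration of a global neural network for new tasks. Within this framework, new techniques are discussed, such as a low-rank approximation and a shared variational prior interpretation for regularizing the dropout posterior. MetaVD offers a general-purpose approach applicable to a wide range of existing deep learning algorithms. The proposed method demonstrated strong adaptation and generalization performance in various few-shot learning applications, including 1D regression, image inpainting, and classification.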

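Federated learning (FL) is a research area of machine learning that aims to train a global inference model from remotely distributed local clients, and it has attracted much attention because of its stronger data privacy protection. However, current FL approaches suffer from problems in real-world scenarios, such as model overfitting and limited, non-i.i.d. client data. To address these problems, MetaVD was extended to distributed learning environments. In FL, the shared hypernetwork is stored on the server and learns to predict client-specific dropout rates. This enables effective model personalization of FL algorithms in limited non-i.i.d. data settings. In addition, a conditional dropout posterior for posterior aggregation was introduced. Extensive experiments were conducted on a variety of sparse, non-i.i.d. FL datasets. MetaVD showed particularly strong classification accuracy and uncertainty calibration for out-of-distribution clients. By compressing the local model parameters required for each client, MetaVD mitigates model overfitting and reduces communication costs. MetaVD also achieves state-of-the-art performance in FL with multi-domain datasets.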

Overall, this thesis presents a comprehensive framework for Bayesian meta-learning approaches that address problems in few-shot learning and federated learning. Beyond uncertainty estimation and calibration, conditional dropout posterior modeling enables efficient model adaptation and personalization. The experimental results show the strong performance of the proposed approach, which contributes to the advancement of meta-learning and its applications in real-world scenarios.

Meta-learning is a subfield of machine learning that aims to develop algorithms capable of rapid adaptation to new tasks. This rapid adaptation can be achieved by leveraging a meta-learner, which learns the learning process itself rather than focusing on individual tasks. The meta-learner can then be used to adapt a model to new tasks efficiently. Recently, diverse meta-learning approaches have been introduced, including metric-based, optimization-based, and model-based methods, and applied to problems such as few-shot regression, few-shot classification, active learning, and reinforcement learning. However, conventional meta-learning approaches, and the meta-learner in particular, still have limitations in computational demands, scalability, and model overfitting.

This thesis introduces a new Bayesian meta-learning approach called Meta-Variational Dropout (MetaVD). MetaVD leverages a hypernetwork to approximate conditional (task-specific) dropout rates for each neural network weight. This enables quick reconfiguration of a globally learned, shared neural network for new tasks, as well as data-efficient learning in multi-task environments. Several novel techniques within this framework are discussed, including a low-rank approximation for memory-efficient mapping of dropout rates over all of the network's weights and a new shared variational prior interpretation for regularizing the dropout posterior. MetaVD is a versatile approach that can be applied to a wide range of conventional deep neural network algorithms. The proposed methodology demonstrated excellent adaptation and generalization performance in various few-shot learning applications, including 1D regression, image inpainting, and classification.
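
As a rough illustration of the idea described above, the following Python sketch maps a task embedding to per-weight dropout rates through a rank-1 factorization and then samples weights with Gaussian (variational) dropout, w = theta * (1 + sqrt(alpha) * eps). The abstract does not specify the factorization or the hypernetwork architecture, so the rank-1 form, the linear maps `A` and `B`, the bias `c`, and all function names are assumptions made for illustration only, not the thesis's implementation.

```python
import numpy as np

def low_rank_log_alpha(task_embedding, A, B, c):
    """Hypothetical hypernetwork head: task embedding -> log dropout rates.

    Produces an (out_dim, in_dim) matrix from only out_dim + in_dim
    predicted values (rank-1 outer product plus a scalar bias c).
    """
    u = A @ task_embedding          # shape (out_dim,)
    v = B @ task_embedding          # shape (in_dim,)
    return np.outer(u, v) + c

def sample_weights(theta, log_alpha, rng):
    """Gaussian dropout sample: w ~ N(theta, alpha * theta**2)."""
    alpha = np.exp(log_alpha)
    eps = rng.standard_normal(theta.shape)
    return theta * (1.0 + np.sqrt(alpha) * eps)

rng = np.random.default_rng(0)
out_dim, in_dim, emb_dim = 4, 3, 5
theta = rng.standard_normal((out_dim, in_dim))   # shared (global) weights
A = rng.standard_normal((out_dim, emb_dim))      # assumed hypernetwork parameters
B = rng.standard_normal((in_dim, emb_dim))
task_embedding = rng.standard_normal(emb_dim)    # assumed summary of a task's data

log_alpha = low_rank_log_alpha(task_embedding, A, B, c=-3.0)
w_task = sample_weights(theta, log_alpha, rng)   # task-specific weight sample
```

In this sketch the shared weights `theta` stay fixed across tasks; only the predicted dropout rates change, which is one way to read "quick reconfiguration" of a shared network.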

Federated learning (FL) is a research field in machine learning that aims to train a global inference model from remotely distributed local clients, and it has gained popularity because it improves data privacy. However, conventional FL approaches encounter many challenges in practical scenarios, including model overfitting and diverging local models caused by the limited and non-i.i.d. data on clients' devices. To address these issues, MetaVD is extended and applied to the distributed environment. In the FL setting, the shared hypernetwork is kept on the server and learns to predict client-dependent dropout rates. This allows effective model personalization of FL algorithms in limited non-i.i.d. data settings. In addition, posterior aggregation based on the conditional dropout posterior is introduced. We performed extensive experiments on various sparse and non-i.i.d. FL datasets. MetaVD demonstrated outstanding classification accuracy and generalization performance, particularly for out-of-distribution (OOD) clients. MetaVD also compresses the local model parameters needed for each client, reducing communication costs and improving the calibration of the model's predictions.
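
The abstract does not state the aggregation rule, so the sketch below only illustrates one standard option: precision-weighted averaging of per-client Gaussian posteriors N(theta_k, alpha_k * theta_k**2), in which weights with high dropout rates (high variance) contribute less to the server model. The function name, the stabilizing constant `eps`, and the choice of this rule are assumptions; the thesis's actual aggregation may differ.

```python
import numpy as np

def aggregate_posteriors(thetas, alphas, eps=1e-8):
    """Precision-weighted mean of client weights (assumed rule, not the thesis's).

    thetas, alphas: arrays of shape (num_clients, ...) holding each client's
    weight means and dropout rates. Variance per weight is alpha * theta**2.
    """
    thetas = np.asarray(thetas, dtype=float)
    alphas = np.asarray(alphas, dtype=float)
    variances = alphas * thetas**2 + eps     # eps avoids division by zero
    precisions = 1.0 / variances
    return (precisions * thetas).sum(axis=0) / precisions.sum(axis=0)

# Two clients, one scalar weight each: variances are 1 and 9,
# so the low-variance client dominates the aggregate.
aggregated = aggregate_posteriors(thetas=[[1.0], [3.0]], alphas=[[1.0], [1.0]])
```

With equal variances across clients this rule reduces to a plain average, matching standard federated averaging with equal client weights.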

Overall, we propose a novel Bayesian meta-learning approach that addresses many challenges in few-shot learning and federated learning applications. The conditional dropout posterior enables efficient model personalization, uncertainty calibration, and outstanding predictive performance. Experimental results confirm the strong performance of the proposed approach, which contributes to the development of meta-learning and its application in real-world scenarios.

Language: eng

URI: https://hdl.handle.net/10371/196469

https://dcollection.snu.ac.kr/common/orgView/000000178479

Files in This Item:

000000178479.pdf 43.06 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Computer Science and Engineering (컴퓨터공학부)
  - Theses (Ph.D. / Sc.D._컴퓨터공학부)
