FedSup: Teacher-Student Architecture for Federated Learning with Unlabeled Clients

길광연

FedSup: Teacher-Student Architecture for Federated Learning with Unlabeled Clients : FedSup: Semi-Supervised Federated Learning with a Teacher-Student Structure

Authors: 길광연

Advisor: 김형신

Issue Date: 2023

Publisher: Seoul National University Graduate School

Keywords: Federated Learning ; Semi-Supervised Learning

Description: Thesis (Master's) -- Seoul National University Graduate School: Graduate School of Data Science, Department of Data Science, February 2023. Advisor: 김형신.

Abstract: Federated Learning (FL) is a machine learning paradigm in which multiple heterogeneous clients train local models on their own data and share only the parameters with the server to create a centralized model. This paradigm, however, rests on the unrealistic assumption that every client has fully labeled data readily available for training. Because labeling data generally requires domain expertise and consistency, which are difficult to attain in a federated setup, it is more pragmatic to consider a scenario in which clients own completely unlabeled data while the server holds a small fraction of labeled data (Labels-At-Server). Methods that exploit unlabeled data at clients are being actively researched; they take advantage of stochastic augmentations to improve the quality of pseudo-labels. Inspired by recent semi-supervised learning (SSL) methods and knowledge distillation, we propose FedSup, a teacher-student architecture for semi-supervised FL, to tackle this problem. To demonstrate its validity, we conduct various experiments on CIFAR-10, CIFAR-100, and STL-10, comparing against naive applications of four popular SSL methods to FL and against the state-of-the-art semi-supervised FL methods FedMatch and FedRGD. On both independent and identically distributed (IID) and non-IID data, FedSup achieves higher accuracy than the other methods on all three datasets under fine-tuning. We also conduct ablation studies on CIFAR-10 to explore why FedSup works better.
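Abstract (translated from the Korean): Federated learning (FL) is a machine learning paradigm in which multiple clients train models on their local data and share only the parameters with the server to build a centralized model. However, this paradigm rests on the unrealistic assumption that all data are fully labeled. Labeling data generally requires domain expertise and consistency, which are difficult to achieve in federated learning. It is therefore more practical to consider a scenario in which clients hold unlabeled data while the server holds labeled data (Labels-At-Server). Methods for exploiting unlabeled data at clients are being actively studied; they use stochastic data augmentation to improve the quality of pseudo-labels. Inspired by recent SSL methods and knowledge distillation, we propose FedSup, a teacher-student architecture for semi-supervised federated learning, to solve this problem. To demonstrate the validity of FedSup, we conduct various experiments on CIFAR-10/CIFAR-100/STL-10, applying four SSL methods to federated learning alongside the recent semi-supervised federated learning methods FedMatch and FedRGD. On both IID and non-IID data, FedSup shows higher accuracy than the other methods under fine-tuning on all three datasets. We also conduct an ablation study on CIFAR-10 to explore why FedSup works well.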
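
Illustrative note (not part of the thesis record): the abstract mentions two generic building blocks, sharing client parameters with a server to form a centralized model, and pseudo-labeling unlabeled client data. The sketch below shows one common form of each, assuming FedAvg-style size-weighted averaging and a fixed confidence threshold. The abstract does not specify FedSup's actual aggregation rule or pseudo-labeling criterion, so the names `average_parameters` and `pseudo_label`, the flat-list parameter layout, and the 0.95 threshold are all hypothetical.

```python
from typing import Dict, List, Optional, Sequence

# Hypothetical layout: layer name -> flat list of weights.
Params = Dict[str, List[float]]


def average_parameters(client_params: Sequence[Params],
                       client_sizes: Sequence[int]) -> Params:
    """Size-weighted average of client parameters (FedAvg-style; an assumption)."""
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need exactly one dataset size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("client sizes must sum to a positive number")
    averaged: Params = {}
    for name, reference in client_params[0].items():
        # Average each coordinate across clients, weighted by dataset size.
        averaged[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(len(reference))
        ]
    return averaged


def pseudo_label(probs: Sequence[float], threshold: float = 0.95) -> Optional[int]:
    """Return the most likely class if its probability reaches the threshold.

    Returns None for low-confidence predictions, which would then be
    excluded from training on that unlabeled sample.
    """
    best = max(range(len(probs)), key=lambda k: probs[k])
    return best if probs[best] >= threshold else None


if __name__ == "__main__":
    clients = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
    print(average_parameters(clients, client_sizes=[1, 3]))
    print(pseudo_label([0.01, 0.97, 0.02]))
    print(pseudo_label([0.40, 0.35, 0.25]))
```

In a teacher-student setup, the probabilities passed to `pseudo_label` would typically come from the teacher model's prediction on an augmented view of a client sample; how FedSup does this is described in the thesis itself, not in this record.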

Language: eng

URI: https://hdl.handle.net/10371/193610

https://dcollection.snu.ac.kr/common/orgView/000000175050

Files in This Item:

000000175050.pdf 2.66 MB

Appears in Collections:

Graduate School of Data Science
- Theses (Master's Degree, Department of Data Science)
