Bayesian Neural Bandit Using Online SWAG

Abstract: In this paper, we propose the Neural SWAG Bandit, an algorithm that combines a neural network-based bandit algorithm with Stochastic Weight Averaging-Gaussian (SWAG), a Bayesian deep learning method. A Neural Bandit is a bandit algorithm that uses the output of a neural network as its reward estimate. SWAG samples network parameters from a Gaussian posterior distribution and has been shown to achieve state-of-the-art performance and robustness compared to benchmark algorithms. By adapting SWAG to an online setting and combining it with a Neural Bandit, we can sample efficiently from deep neural networks while learning online. Our experimental results indicate that the Neural SWAG Bandit benefits from Bayesian deep learning and outperforms existing benchmark algorithms.
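
To make the idea in the abstract concrete, the following is a minimal sketch of how online SWAG moments might drive arm selection in a neural bandit. It is not the algorithm from the thesis, whose details the abstract does not give. Everything here is an assumption made for illustration: the class `OnlineDiagSWAG`, the helpers `predict`, `sgd_step`, and `run`, the diagonal-only covariance (SWAG as usually described also keeps a low-rank term), the tiny tanh network, the learning rate, and the toy linear reward environment.

```python
import math
import random


class OnlineDiagSWAG:
    """Running first and second moments of a flat weight vector (diagonal only)."""

    def __init__(self, dim):
        self.n = 0                   # number of weight snapshots collected
        self.mean = [0.0] * dim      # running mean of the weights
        self.sq_mean = [0.0] * dim   # running mean of the squared weights

    def collect(self, weights):
        # Incremental averages, so no past snapshot needs to be stored.
        self.n += 1
        for i, w in enumerate(weights):
            self.mean[i] += (w - self.mean[i]) / self.n
            self.sq_mean[i] += (w * w - self.sq_mean[i]) / self.n

    def sample(self, rng):
        # Draw w ~ N(mean, diag(sq_mean - mean^2)); clamp tiny negative variances.
        out = []
        for m, s in zip(self.mean, self.sq_mean):
            var = max(s - m * m, 0.0)
            out.append(m + math.sqrt(var) * rng.gauss(0.0, 1.0))
        return out


def predict(w, x, hidden):
    """One-hidden-layer tanh network stored as a flat weight list."""
    d = len(x)
    acts = [math.tanh(sum(w[j * d + i] * x[i] for i in range(d)))
            for j in range(hidden)]
    out = sum(w[hidden * d + j] * acts[j] for j in range(hidden))
    return out, acts


def sgd_step(w, x, reward, hidden, lr=0.05):
    """One SGD step on the squared error 0.5 * (prediction - reward)^2."""
    d = len(x)
    out, acts = predict(w, x, hidden)
    err = out - reward
    new_w = list(w)
    for j in range(hidden):
        new_w[hidden * d + j] -= lr * err * acts[j]
        back = err * w[hidden * d + j] * (1.0 - acts[j] ** 2)
        for i in range(d):
            new_w[j * d + i] -= lr * back * x[i]
    return new_w


def run(rounds=200, n_arms=3, d=2, hidden=4, seed=0):
    rng = random.Random(seed)
    dim = hidden * d + hidden
    w = [rng.gauss(0.0, 0.5) for _ in range(dim)]
    swag = OnlineDiagSWAG(dim)
    swag.collect(w)
    theta = [(-1.0) ** i for i in range(d)]  # made-up toy environment
    choices = []
    for _ in range(rounds):
        contexts = [[rng.uniform(-1.0, 1.0) for _ in range(d)]
                    for _ in range(n_arms)]
        # Thompson-style step: sample weights, then act greedily on the sample.
        w_s = swag.sample(rng)
        arm = max(range(n_arms),
                  key=lambda a: predict(w_s, contexts[a], hidden)[0])
        x = contexts[arm]
        reward = sum(t * xi for t, xi in zip(theta, x)) + rng.gauss(0.0, 0.1)
        # Online update: one SGD step on the new observation, then fold the
        # updated weights into the running SWAG moments.
        w = sgd_step(w, x, reward, hidden)
        swag.collect(w)
        choices.append(arm)
    return swag, choices
```

Calling `run(rounds=50)` returns the moment tracker and the list of chosen arms. The design point the sketch tries to show is that only two running averages are kept per weight, so sampling a full set of network weights each round stays cheap even as observations arrive one at a time.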

URI: https://hdl.handle.net/10371/187967

https://dcollection.snu.ac.kr/common/orgView/000000173190
