Bitwidth-Adaptive Quantization-Aware Neural Network Training: A Meta-Learning Approach
Cited 4 times in Web of Science
Cited 3 times in Scopus
- Authors
- Issue Date
- 2022
- Publisher
- Springer-Verlag
- Citation
- Lecture Notes in Computer Science, Vol. 13672, pp. 208-224
- Abstract
- Deep neural network quantization with adaptive bitwidths has gained increasing attention because it eases model deployment on platforms with different resource budgets. In this paper, we propose a meta-learning approach to achieve this goal. Specifically, we propose MEBQAT, a simple yet effective method for bitwidth-adaptive quantization-aware training (QAT) in which meta-learning is combined with QAT by redefining meta-learning tasks to incorporate bitwidths. After deployment on a platform, MEBQAT allows the (meta-)trained model to be quantized to any candidate bitwidth with minimal loss of inference accuracy. Moreover, in a few-shot learning scenario, MEBQAT can also adapt a model to any bitwidth as well as to unseen target classes by adding conventional optimization- or metric-based meta-learning. We design variants of MEBQAT to support both (1) a bitwidth-adaptive quantization scenario and (2) a new few-shot learning scenario in which both quantization bitwidths and target classes are jointly adapted. Our experiments show that merging bitwidths into meta-learning tasks yields remarkable improvements: 98.7% less storage cost than bitwidth-dedicated QAT and 94.7% less backpropagation than bitwidth-adaptive QAT in bitwidth-only adaptation scenarios, while improving classification accuracy by up to 63.6% over vanilla meta-learning in bitwidth-class joint adaptation scenarios. (An illustrative code sketch of the bitwidth-as-task idea appears below.)
- ISSN
- 0302-9743
- Files in This Item:
- There are no files associated with this item.
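
The following is an illustrative sketch, not the authors' implementation. It shows, in PyTorch, the core idea the abstract describes: folding candidate bitwidths into the training tasks so that a single set of weights tolerates quantization at every bitwidth. All names here (`fake_quantize`, `QuantLinear`, `TinyNet`, `candidate_bits`) and the toy data are hypothetical.

```python
# Hypothetical sketch of bitwidth-as-task QAT, loosely following the MEBQAT idea.
# Not the authors' code; details (task sampling, few-shot variants) differ in the paper.
import torch
import torch.nn as nn
import torch.nn.functional as F

def fake_quantize(x, bits):
    """Uniform symmetric fake quantization with a straight-through estimator."""
    if bits >= 32:                        # treat >=32 bits as full precision
        return x
    qmax = 2 ** (bits - 1) - 1
    scale = x.detach().abs().max().clamp(min=1e-8) / qmax
    q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
    return x + (q - x).detach()           # STE: quantized forward, identity backward

class QuantLinear(nn.Linear):
    """Linear layer whose weights are fake-quantized to a runtime-chosen bitwidth."""
    def forward(self, x, bits=32):
        return F.linear(x, fake_quantize(self.weight, bits), self.bias)

class TinyNet(nn.Module):
    def __init__(self, in_dim=32, hidden=64, classes=10):
        super().__init__()
        self.fc1 = QuantLinear(in_dim, hidden)
        self.fc2 = QuantLinear(hidden, classes)
    def forward(self, x, bits=32):
        return self.fc2(F.relu(self.fc1(x, bits)), bits)

# Training loop where each step covers every bitwidth "task", so one set of
# weights learns to stay accurate under all candidate quantization levels.
model = TinyNet()
opt = torch.optim.SGD(model.parameters(), lr=0.1)
candidate_bits = [2, 4, 8, 32]            # hypothetical candidate bitwidths

for step in range(100):
    x = torch.randn(16, 32)               # toy batch standing in for real data
    y = torch.randint(0, 10, (16,))
    opt.zero_grad()
    # Sum per-bitwidth losses; one backward pass serves all bitwidth tasks.
    loss = sum(F.cross_entropy(model(x, bits=b), y) for b in candidate_bits)
    loss.backward()
    opt.step()

# After (meta-)training, the same weights can be deployed at any candidate bitwidth:
with torch.no_grad():
    logits_int4 = model(torch.randn(1, 32), bits=4)
```

Summing the per-bitwidth losses in a single backward pass is one simple way to share training across bitwidth tasks; the paper's actual meta-task construction and its few-shot (bitwidth-class joint adaptation) variants go beyond this sketch.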