Publications

Detailed Information

Refurbish your training data: Reusing partially augmented samples for faster deep neural network training

Cited 11 times in Web of Science · Cited 11 times in Scopus
Authors

Lee, Gyewon; Lee, Irene; Ha, Hyeonmin; Lee, Kyunggeun; Hyun, Hwarim; Shin, Ahnjae; Chun, Byung-Gon

Issue Date
2021-01
Publisher
USENIX Association
Citation
2021 USENIX Annual Technical Conference, pp. 537–550
Abstract
Data augmentation is a widely adopted technique for improving the generalization of deep learning models. It provides additional diversity to the training samples by applying random transformations. Although useful, data augmentation often incurs heavy CPU overhead, which can degrade training speed. To solve this problem, we propose data refurbishing, a novel sample-reuse mechanism that accelerates deep neural network training while preserving model generalization. Instead of treating data augmentation as a black-box operation, data refurbishing splits it into a partial and a final augmentation. It reuses partially augmented samples to reduce CPU computation, while further transforming them with the final augmentation to preserve the sample diversity obtained by data augmentation. We design and implement a new data loading system, Revamper, to realize data refurbishing. It maximizes the overlap between the CPU and deep learning accelerators by keeping the CPU processing time of each training step constant. Our evaluation shows that Revamper can accelerate the training of computer vision models by 1.03×–2.04× while maintaining comparable accuracy.
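
The split described in the abstract, caching the expensive front portion of the augmentation pipeline while re-running only a cheap final stage on every access, can be illustrated with a dataset wrapper. Below is a minimal sketch rather than Revamper's actual API: the names `RefurbishDataset`, `partial_transform`, `final_transform`, and `reuse_factor` are illustrative assumptions, and the paper's scheduling that keeps per-step CPU time constant is omitted.

```python
# Illustrative sketch of data refurbishing (not Revamper's actual API).
# Partially augmented samples are cached and reused several times; only
# the cheap final augmentation is recomputed on every access.

class RefurbishDataset:
    """Wraps a base dataset and caches partially augmented samples.

    partial_transform: the expensive front part of the augmentation pipeline
    final_transform:   the cheap tail, re-applied on every access for diversity
    reuse_factor:      how many times a cached sample is served before refresh
    """

    def __init__(self, base, partial_transform, final_transform, reuse_factor=3):
        self.base = base
        self.partial = partial_transform
        self.final = final_transform
        self.reuse_factor = reuse_factor
        self.cache = {}       # index -> partially augmented sample
        self.remaining = {}   # index -> reuses left before recomputation

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        if self.remaining.get(idx, 0) == 0:
            # Cache miss or expired entry: redo the expensive partial stage.
            x, y = self.base[idx]
            self.cache[idx] = (self.partial(x), y)
            self.remaining[idx] = self.reuse_factor
        self.remaining[idx] -= 1
        x, y = self.cache[idx]
        # The final stage is always recomputed, preserving sample diversity.
        return self.final(x), y
```

In the paper's design, Revamper additionally schedules which cached samples are refreshed in each epoch so that the amount of full recomputation per training step stays balanced, keeping CPU processing time constant and overlapping it with accelerator computation; the sketch above leaves that scheduling out.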
URI
https://hdl.handle.net/10371/183780