Publications

Detailed Information

Refurbish your training data: Reusing partially augmented samples for faster deep neural network training

Cited 11 times in Web of Science · Cited 11 times in Scopus
Authors

Lee, Gyewon; Lee, Irene; Ha, Hyeonmin; Lee, Kyunggeun; Hyun, Hwarim; Shin, Ahnjae; Chun, Byung-Gon

Issue Date
2021-01
Publisher
USENIX Association
Citation
2021 USENIX Annual Technical Conference, pp. 537–550
Abstract
Data augmentation is a widely adopted technique for improving the generalization of deep learning models. It provides additional diversity to the training samples by applying random transformations. Although useful, data augmentation often incurs heavy CPU overhead, which can degrade training speed. To solve this problem, we propose data refurbishing, a novel sample-reuse mechanism that accelerates deep neural network training while preserving model generalization. Instead of treating data augmentation as a black-box operation, data refurbishing splits it into a partial and a final augmentation. It reuses partially augmented samples to reduce CPU computation, while further transforming them with the final augmentation to preserve the sample diversity obtained by data augmentation. We design and implement a new data loading system, Revamper, to realize data refurbishing. It maximizes the overlap between the CPU and deep learning accelerators by keeping the CPU processing time of each training step constant. Our evaluation shows that Revamper can accelerate the training of computer vision models by 1.03×–2.04× while maintaining comparable accuracy.
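
The split described in the abstract, caching the expensive front portion of the augmentation pipeline while re-running only a cheap final stage on every access, can be illustrated with a dataset wrapper. Below is a minimal sketch rather than Revamper's actual API: the names `RefurbishDataset`, `partial_transform`, `final_transform`, and `reuse_factor` are illustrative assumptions, and the paper's scheduling that keeps per-step CPU time constant is omitted.

```python
# Illustrative sketch of data refurbishing (not Revamper's actual API).
# Partially augmented samples are cached and reused several times; only
# the cheap final augmentation is recomputed on every access.

class RefurbishDataset:
    """Wraps a base dataset and caches partially augmented samples.

    partial_transform: the expensive front part of the augmentation pipeline
    final_transform:   the cheap tail, re-applied on every access for diversity
    reuse_factor:      how many times a cached sample is served before refresh
    """

    def __init__(self, base, partial_transform, final_transform, reuse_factor=3):
        self.base = base
        self.partial = partial_transform
        self.final = final_transform
        self.reuse_factor = reuse_factor
        self.cache = {}       # index -> partially augmented sample
        self.remaining = {}   # index -> reuses left before recomputation

    def __len__(self):
        return len(self.base)

    def __getitem__(self, idx):
        if self.remaining.get(idx, 0) == 0:
            # Cache miss or expired entry: redo the expensive partial stage.
            x, y = self.base[idx]
            self.cache[idx] = (self.partial(x), y)
            self.remaining[idx] = self.reuse_factor
        self.remaining[idx] -= 1
        x, y = self.cache[idx]
        # The final stage is always recomputed, preserving sample diversity.
        return self.final(x), y
```

In the paper's design, Revamper additionally schedules which cached samples are refreshed in each epoch so that the amount of full recomputation per training step stays balanced, keeping CPU processing time constant and overlapping it with accelerator computation; the sketch above leaves that scheduling out.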
URI
https://hdl.handle.net/10371/183780