Publications
Detailed Information
Energy-efficient Neural Network Accelerator Based on Outlier-aware Low-precision Computation
Cited 98 time in
Web of Science
Cited 123 time in Scopus
- Authors
- Issue Date
- 2018-06
- Publisher
- IEEE
- Citation
- 2018 ACM/IEEE 45TH ANNUAL INTERNATIONAL SYMPOSIUM ON COMPUTER ARCHITECTURE (ISCA), pp.688-698
- Abstract
- Owing to the presence of large values, which we call outliers, conventional methods of quantization fail to achieve significantly low precision, e.g., four bits, for very deep neural networks, such as ResNet-101. In this study, we propose a hardware accelerator, called the outlier-aware accelerator (OLAccel). It performs dense and low-precision computations for a majority of data (weights and activations) while efficiently handling a small number of sparse and high-precision outliers (e.g., amounting to 3% of total data). The OLAccel is based on 4-bit multiply-accumulate (MAC) units and handles outlier weights and activations in a different manner. For outlier weights, it equips SIMD lanes of MAC units with an additional MAC unit, which helps avoid cycle overhead for the majority of outlier occurrences, i.e., a single occurrence in the SIMD lanes. The OLAccel performs computations using outlier activation on dedicated, high-precision MAC units. In order to avoid coherence problem due to updates from low and high-precision computation units, both units update partial sums in a pipelined manner. Our experiments show that the OLAccel can reduce by 43.5% (27.0%), 56.7% (36.3%), and 62.2% (49.5%) energy consumption for AlexNet, VGG-16, and ResNet-18, respectively, compared with a 16-bit (8-bit) state-of-the-art zero-aware accelerator. The energy gain mostly comes from the memory components, the DRAM, and on-chip memory due to reduced precision.
- ISSN
- 1063-6897
- Files in This Item:
- There are no files associated with this item.
Item View & Download Count
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.