ELSA: Hardware-software Co-design for efficient, lightweight self-attention mechanism in neural networks

Cited 47 times in Web of Science; cited 53 times in Scopus
Authors

Ham, Tae Jun; Lee, Yejin; Seo, Seong Hoon; Kim, Soosung; Choi, Hyunji; Jung, Sung Jun; Lee, Jae Wook

Issue Date
2021-06
Publisher
IEEE
Citation
Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA, Vol.2021-June, pp.692-705
Abstract
© 2021 IEEE. The self-attention mechanism is rapidly emerging as one of the most important key primitives in neural networks (NNs) for its ability to identify the relations within input entities. The self-attention-oriented NN models such as Google Transformer and its variants have established the state-of-the-art on a very wide range of natural language processing tasks, and many other self-attention-oriented models are achieving competitive results in computer vision and recommender systems as well. Unfortunately, despite its great benefits, the self-attention mechanism is an expensive operation whose cost increases quadratically with the number of input entities that it processes, and thus accounts for a significant portion of the inference runtime. Thus, this paper presents ELSA (Efficient, Lightweight Self-Attention), a hardware-software co-designed solution to substantially reduce the runtime as well as energy spent on the self-attention mechanism. Specifically, based on the intuition that not all relations are equal, we devise a novel approximation scheme that significantly reduces the amount of computation by efficiently filtering out relations that are unlikely to affect the final output. With the specialized hardware for this approximate self-attention mechanism, ELSA achieves a geomean speedup of 58.1× as well as over three orders of magnitude improvements in energy efficiency compared to GPU on self-attention computation in modern NN models while maintaining less than 1% loss in the accuracy metric.
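The quadratic cost and the filtering intuition described in the abstract can be illustrated with a minimal sketch. This is not the ELSA approximation scheme or its hardware: the abstract does not specify how relations are filtered, so the sketch assumes a plain exact top-k selection as a stand-in. The function name `attention` and the `keep` parameter are hypothetical, introduced only for illustration. Because the stand-in computes every score exactly before filtering, it does not reduce the score computation itself; it only shows that dropping low-scoring relations can leave the output nearly unchanged.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values, keep=None):
    """Scaled dot-product attention over lists of vectors.

    If `keep` is given (hypothetical parameter), only the `keep`
    highest-scoring keys per query contribute to the output. This exact
    top-k filter is an assumed stand-in, not the scheme from the paper.
    Returns (outputs, pairs_used), where pairs_used counts the
    query-key relations that were actually weighted.
    """
    d = len(keys[0])
    outputs = []
    pairs_used = 0
    for q in queries:
        # One score per key: n scores per query, n * n in total.
        scores = [dot(q, k) / math.sqrt(d) for k in keys]
        idx = list(range(len(keys)))
        if keep is not None:
            idx = sorted(idx, key=lambda i: scores[i], reverse=True)[:keep]
        weights = softmax([scores[i] for i in idx])
        pairs_used += len(idx)
        out = [sum(w * values[i][j] for w, i in zip(weights, idx))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs, pairs_used

if __name__ == "__main__":
    x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, 0.2]]
    full, n_full = attention(x, x, x)
    approx, n_approx = attention(x, x, x, keep=2)
    print("relations used:", n_full, "vs", n_approx)
```

With n input entities the unfiltered call weights n × n relations, which is the quadratic growth the abstract refers to; with `keep=k` it weights n × k.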
ISSN
1063-6897
URI
https://hdl.handle.net/10371/183738
DOI
https://doi.org/10.1109/ISCA52012.2021.00060
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.

Share