Enabling Fine-Grained Spatial Multitasking on Systolic-Array NPUs Using Dataflow Mirroring

Authors

Choi, Jinwoo; Ha, Yeonan; Lee, Jounghoo; Lee, Sangsu; Lee, Jinho; Jang, Hanhwi; Kim, Youngsok

Issue Date
2023-12
Publisher
IEEE Computer Society
Citation
IEEE Transactions on Computers, Vol. 72, No. 12, pp. 3383-3398
Abstract
Neural Processing Units (NPUs) frequently suffer from low hardware utilization because the efficiency of their systolic arrays depends heavily on the characteristics of the deep neural network (DNN) being run. Spatial multitasking is a promising solution to low NPU hardware utilization; however, the state-of-the-art spatial-multitasking NPU architecture achieves sub-optimal performance due to its coarse-grained systolic-array distribution and incurs significant implementation costs. In this paper, we propose dataflow-mirroring NPU (DM-NPU), a novel spatial-multitasking NPU architecture supporting fine-grained systolic-array distribution. The key idea of DM-NPU is to reverse the dataflows of co-located DNNs in horizontal and/or vertical directions. As a result, DM-NPU can place allocation boundaries between any adjacent processing elements of a systolic array, both horizontally and vertically. We then propose DM-Perf, an accurate systolic-array NPU performance model, to maximize the spatial-multitasking performance of DM-NPU. Existing performance models yield sub-optimal performance because they cannot accurately capture the resource contention caused by spatial multitasking. DM-Perf, on the other hand, exploits the per-layer performance profiles of a DNN to accurately capture the resource contention. Our evaluation using MLPerf DNNs shows that DM-NPU and DM-Perf improve performance by up to 35.1% over the state-of-the-art NPU architecture and performance model.
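
The partitioning idea in the abstract can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the paper's design: the names `Region` and `partition`, the single row cut and column cut, and the rule that each region's mirroring flags are derived from which corner of the array it touches are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """A rectangular block of processing elements (PEs) given to one DNN.

    Hypothetical structure for illustration only.
    """
    rows: range      # PE rows owned by this region
    cols: range      # PE columns owned by this region
    mirror_h: bool   # True: horizontal dataflow reversed (assumed right-to-left)
    mirror_v: bool   # True: vertical dataflow reversed (assumed bottom-to-top)

    @property
    def num_pes(self) -> int:
        return len(self.rows) * len(self.cols)


def partition(height: int, width: int, row_cut: int, col_cut: int) -> list[Region]:
    """Split a height x width array at one row boundary and one column boundary.

    The cuts may fall between any two adjacent PEs. Regions on the right of
    the column cut get a horizontally mirrored dataflow, and regions below
    the row cut get a vertically mirrored one (an assumption of this sketch).
    """
    if not (0 <= row_cut <= height and 0 <= col_cut <= width):
        raise ValueError("cut lies outside the array")
    candidates = [
        Region(range(0, row_cut), range(0, col_cut), False, False),
        Region(range(0, row_cut), range(col_cut, width), True, False),
        Region(range(row_cut, height), range(0, col_cut), False, True),
        Region(range(row_cut, height), range(col_cut, width), True, True),
    ]
    # Drop empty regions, e.g. when a cut sits on the array edge.
    return [r for r in candidates if r.num_pes > 0]
```

For example, `partition(128, 128, 40, 100)` yields four regions that together cover all 16384 PEs, while `partition(8, 8, 8, 3)` yields two side-by-side regions, the second with only its horizontal dataflow mirrored.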
ISSN
0018-9340
URI
https://hdl.handle.net/10371/201062
DOI
https://doi.org/10.1109/TC.2023.3299030
Related Researcher

  • College of Engineering
  • Department of Electrical and Computer Engineering
Research Area: AI Accelerators, Distributed Deep Learning, Neural Architecture Search

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.