ParallelFusion: Towards Maximum Utilization of Mobile GPU for DNN Inference

Cited 0 times in Web of Science; cited 6 times in Scopus
Authors

Lee, Jingyu; Liu, Yunxin; Lee, Youngki

Issue Date
2021-06
Publisher
Association for Computing Machinery, Inc
Citation
EMDL 2021 - Proceedings of the 2021 5th International Workshop on Embedded and Mobile Deep Learning, Part of MobiSys 2021, pp.25-30
Abstract
© 2021 ACM. Mobile GPUs are extremely under-utilized for DNN computations across different mobile deep learning frameworks and multiple DNNs of various complexities. We explore the feasibility of batching and find that it improves throughput by up to 35%. However, real-time mobile applications have a limited number of requests and so gain little from batching. To tackle this challenge, we present ParallelFusion, a technique that enables concurrent execution of heterogeneous operators to further utilize the mobile GPU. We implemented ParallelFusion on top of the MNN framework and evaluated it on 6 state-of-the-art DNNs. Our evaluation shows that ParallelFusion achieves up to 195% to 218% throughput with fused execution of 2 and 3 operators, compared to single-DNN inference.
URI
https://hdl.handle.net/10371/183743
DOI
https://doi.org/10.1145/3469116.3470014