FARNN: FPGA-GPU Hybrid Acceleration Platform for Recurrent Neural Networks

Cited 5 times in Web of Science; cited 9 times in Scopus
Authors

Cho, Hyungmin; Lee, Jeesoo; Lee, Jaejin

Issue Date
2022-07-01
Publisher
Institute of Electrical and Electronics Engineers
Citation
IEEE Transactions on Parallel and Distributed Systems, Vol.33 No.7, pp.1725-1738
Abstract
GPU-based platforms provide high computation throughput for large mini-batch deep neural network computations. However, a large batch size may not be ideal for some situations, such as aiming at low latency, training on edge/mobile devices, partial retraining for personalization, and having irregular input sequence lengths. GPU performance suffers from low utilization especially for small-batch recurrent neural network (RNN) applications where sequential computations are required. In this article, we propose a hybrid architecture, called FARNN, which combines a GPU and an FPGA to accelerate RNN computation for small batch sizes. After separating RNN computations into GPU-efficient and GPU-inefficient tasks, we design special FPGA computation units that accelerate the GPU-inefficient RNN tasks. FARNN off-loads the GPU-inefficient tasks to the FPGA. We evaluate FARNN with synthetic RNN layers of various configurations on the Xilinx UltraScale+ FPGA and the NVIDIA P100 GPU in addition to evaluating it with real RNN applications. The evaluation result indicates that FARNN outperforms the P100 GPU platform for RNN training by up to 4.2x with small batch sizes, long input sequences, and many RNN cells per layer.
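The split the abstract describes can be illustrated with a minimal sketch. The code below is not from the paper: the function names, shapes, and the LSTM-style cell are assumptions chosen to show the idea of separating one recurrent step into a dense matrix-multiply part (batch-parallel, GPU-efficient) and a sequential element-wise gating part (GPU-inefficient at small batch sizes, the kind of task FARNN off-loads to the FPGA).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def matmul_part(x_t, h_prev, W, U, b):
    # GPU-efficient: dense projections of the input and previous hidden state.
    return x_t @ W + h_prev @ U + b  # shape: (batch, 4 * hidden)

def elementwise_part(z, c_prev, hidden):
    # GPU-inefficient at small batch: sequential element-wise gates and
    # state update -- the kind of task FARNN would off-load to the FPGA.
    i = sigmoid(z[:, 0 * hidden:1 * hidden])  # input gate
    f = sigmoid(z[:, 1 * hidden:2 * hidden])  # forget gate
    g = np.tanh(z[:, 2 * hidden:3 * hidden])  # candidate cell state
    o = sigmoid(z[:, 3 * hidden:4 * hidden])  # output gate
    c_t = f * c_prev + i * g
    h_t = o * np.tanh(c_t)
    return h_t, c_t

# Tiny small-batch example: batch=2, input=3, hidden=4 (illustrative sizes).
rng = np.random.default_rng(0)
batch, n_in, hidden = 2, 3, 4
x_t = rng.standard_normal((batch, n_in))
h_prev = np.zeros((batch, hidden))
c_prev = np.zeros((batch, hidden))
W = rng.standard_normal((n_in, 4 * hidden))
U = rng.standard_normal((hidden, 4 * hidden))
b = np.zeros(4 * hidden)

z = matmul_part(x_t, h_prev, W, U, b)           # GPU-efficient task
h_t, c_t = elementwise_part(z, c_prev, hidden)  # GPU-inefficient task
print(h_t.shape)  # (2, 4)
```

Over a long input sequence, the element-wise part must run once per time step in order, which is why small-batch RNNs leave a GPU underutilized and why dedicating FPGA units to that part can pay off.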
ISSN
1045-9219
URI
https://hdl.handle.net/10371/200911
DOI
https://doi.org/10.1109/TPDS.2021.3124125
Files in This Item:
There are no files associated with this item.
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.