FlexReduce: Flexible All-reduce for Distributed Deep Learning on Asymmetric Network Topology

Cited 5 times in Web of Science; cited 9 times in Scopus
Authors

Lee, Jinho; Hwang, Inseok; Shah, Soham; Cho, Minsik

Issue Date
2020
Publisher
IEEE
Citation
Proceedings of the 2020 57th ACM/EDAC/IEEE Design Automation Conference (DAC)
Abstract
We propose FlexReduce, an efficient and flexible all-reduce algorithm for distributed deep learning under irregular network hierarchies. With ever-growing deep neural networks, distributed learning over multiple nodes is becoming imperative for expedited training. Several existing approaches exploit the symmetric structure of the network to optimize performance across its hierarchy levels. However, the assumption of a symmetric network does not always hold, especially in shared cloud environments. By allocating an uneven portion of the gradients to each learner (GPU), FlexReduce outperforms conventional all-reduce algorithms on asymmetric network topologies, and performs on par with or better than them on symmetric ones.
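
The abstract's key idea, giving each learner a gradient chunk sized to its effective link speed rather than an equal share, can be sketched in a few lines. The single-process Python mock below is illustrative only, not the paper's implementation: the proportional partitioning rule, the bandwidth values, and the function names are all assumptions, and a real FlexReduce run derives its schedule from the actual topology and moves data between GPUs.

```python
import numpy as np

# Illustrative sketch only, not the FlexReduce implementation: gradient
# chunks are sized in proportion to each learner's assumed link bandwidth,
# so learners behind slow links carry less all-reduce traffic.

def uneven_partition(num_elems, bandwidths):
    """Split num_elems gradient elements proportionally to the (assumed)
    per-learner bandwidths; the last learner absorbs rounding leftovers."""
    total = sum(bandwidths)
    sizes = [int(num_elems * b / total) for b in bandwidths]
    sizes[-1] += num_elems - sum(sizes)
    return sizes

def allreduce_uneven(grads, bandwidths):
    """Reduce-scatter then all-gather over uneven chunks, mocked in one
    process: grads is a list of equal-length 1-D arrays, one per learner."""
    sizes = uneven_partition(len(grads[0]), bandwidths)
    offsets = np.cumsum([0] + sizes)

    # Reduce-scatter: learner i sums everyone's copy of its own chunk.
    reduced = [sum(g[offsets[i]:offsets[i + 1]] for g in grads)
               for i in range(len(grads))]

    # All-gather: every learner ends up with the full reduced gradient.
    return np.concatenate(reduced)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    grads = [rng.standard_normal(12) for _ in range(3)]
    # Hypothetical asymmetric links: two fast learners, one slow one.
    out = allreduce_uneven(grads, bandwidths=[10.0, 10.0, 2.0])
    assert np.allclose(out, sum(grads))  # matches a plain all-reduce
```

In an actual multi-node run the two phases would be communication steps (e.g. over NCCL or MPI), and the point of the uneven split is that per-learner transfer time, chunk size divided by link bandwidth, stays balanced even when the links are not.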
ISSN
0738-100X
URI
https://hdl.handle.net/10371/200510

Related Researcher

  • College of Engineering
  • Department of Electrical and Computer Engineering
Research Area: AI Accelerators, Distributed Deep Learning, Neural Architecture Search
