PID-Comm: A Fast and Flexible Collective Communication Framework for Commodity Processing-in-DIMM Devices

Cited 0 times in Web of Science; cited 4 times in Scopus
Authors

Noh, Si Ung; Hong, Junguk; Lim, Chaemin; Park, Seongyeon; Kim, Jeehyun; Kim, Hanjun; Kim, Youngsok; Lee, Jinho

Issue Date
2024-06
Publisher
Institute of Electrical and Electronics Engineers Inc.
Citation
Conference Proceedings - Annual International Symposium on Computer Architecture, ISCA, pp.245-260
Abstract
Recent dual in-line memory modules (DIMMs) are starting to support processing-in-memory (PIM) by associating their memory banks with processing elements (PEs), allowing applications to overcome the data movement bottleneck by offloading memory-intensive operations to the PEs. Many highly parallel applications have been shown to benefit from these PIM-enabled DIMMs, but further speedup is often limited by the huge overhead of inter-PE collective communication. This mainly comes from the slow CPU-mediated inter-PE communication methods, making it difficult for PIM-enabled DIMMs to accelerate a wider range of applications. Prior studies have tried to alleviate the communication bottleneck, but they lack enough flexibility and performance to be used for a wide range of applications. In this paper, we present PID-Comm, a fast and flexible inter-PE collective communication framework for commodity PIM-enabled DIMMs. The key idea of PID-Comm is to abstract the PEs as a multi-dimensional hypercube and allow multiple instances of inter-PE collective communication between the PEs belonging to certain dimensions of the hypercube. Leveraging this abstraction, PID-Comm first defines eight inter-PE collective communication patterns that allow applications to easily express their complex communication patterns. Then, PID-Comm provides high-performance implementations of the inter-PE collective communication patterns optimized for the DIMMs. Our evaluation using 16 UPMEM DIMMs and representative parallel algorithms shows that PID-Comm greatly improves the performance by up to 5.19x compared to the existing inter-PE communication implementations. The implementation of PID-Comm is available at https://github.com/AIS-SNU/PID-Comm.
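The hypercube abstraction described in the abstract can be illustrated with a small host-side sketch. This is not PID-Comm's API and does not reflect its optimized implementation: the function names (pe_coords, comm_groups, all_reduce_sum), the coordinate ordering, and the choice of all-reduce as the example collective are assumptions made for illustration only, since the abstract does not list the eight patterns. The sketch only shows how selecting some dimensions of the hypercube splits the PEs into several independent communication groups.

```python
from collections import defaultdict


def pe_coords(pe_id, shape):
    """Convert a flat PE index into hypercube coordinates.

    Assumption: the first axis varies fastest. The real framework may
    order PEs differently.
    """
    coords = []
    for extent in shape:
        coords.append(pe_id % extent)
        pe_id //= extent
    return tuple(coords)


def comm_groups(shape, comm_dims):
    """Partition PE ids into groups whose members differ only along comm_dims.

    Each group is one independent instance of a collective operation.
    """
    total = 1
    for extent in shape:
        total *= extent
    groups = defaultdict(list)
    for pe_id in range(total):
        coords = pe_coords(pe_id, shape)
        # PEs sharing every coordinate outside comm_dims land in the same group.
        key = tuple(c for axis, c in enumerate(coords) if axis not in comm_dims)
        groups[key].append(pe_id)
    return dict(groups)


def all_reduce_sum(values, shape, comm_dims):
    """Reference all-reduce: every PE receives the sum over its own group."""
    result = list(values)
    for members in comm_groups(shape, comm_dims).values():
        group_sum = sum(values[m] for m in members)
        for m in members:
            result[m] = group_sum
    return result


if __name__ == "__main__":
    shape = (2, 2, 2)  # 8 hypothetical PEs arranged as a 3-dimensional hypercube
    print(comm_groups(shape, {0}))
    print(all_reduce_sum(list(range(8)), shape, {0}))
```

With shape (2, 2, 2) and communication along dimension 0 only, the eight PEs form four groups of two, so four all-reduce instances run side by side. Choosing dimensions {0, 2} instead yields two groups of four.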
ISSN
1063-6897
URI
https://hdl.handle.net/10371/209128
DOI
https://doi.org/10.1109/ISCA59077.2024.00027
Related Researcher

  • College of Engineering
  • Department of Electrical and Computer Engineering
Research Area: AI Accelerators, Distributed Deep Learning, Neural Architecture Search
