McDRAM: Low Latency and Energy-Efficient Matrix Computation in DRAM

Authors

신현승

Advisor
유승주
Major
Department of Computer Science and Engineering, College of Engineering
Issue Date
2018-02
Publisher
Graduate School, Seoul National University
Keywords
Neural Network; DRAM; RNN; LSTM; MLP; MAC; HBM2
Description
Thesis (Master's) -- Graduate School, Seoul National University: Department of Computer Science and Engineering, College of Engineering, February 2018. Advisor: 유승주.
Abstract
Neural networks are characterized by massively parallel computation and high memory bandwidth demand. In particular, memory bandwidth severely limits performance and increases power consumption. To overcome the memory bottleneck of neural network applications, we propose a novel memory architecture called McDRAM, in which DRAM dies are equipped with a large number of multiplier-accumulator (MAC) units to perform neural network computation internally. Each DRAM bank has as many MACs as the size of its memory prefetch, thereby fully utilizing the internal bandwidth of DRAM, which is far larger than the external memory bandwidth. McDRAM broadcasts data efficiently to all banks without any modification of the DRAM data bus, and it performs MAC operations in all banks with a single DRAM command. McDRAM is implemented on top of a state-of-the-art commercial memory architecture, HBM2, and integrates thousands of MACs (up to 6,144 in HBM2) in a single DRAM package. According to our experiments with in-house memory models built on a commercial JEDEC HBM2 simulator, McDRAM achieves 18.68x better TOPS/W than a state-of-the-art hardware accelerator (Google TPU) on LSTM workloads.
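The abstract's central argument can be illustrated with back-of-the-envelope arithmetic: the total MAC count scales with the number of banks, and the aggregate internal prefetch width dwarfs the external pin bandwidth. The sketch below is illustrative only; the channel, bank, prefetch, and per-bank MAC figures are assumed HBM2-like parameters chosen so the total matches the abstract's 6,144 figure, and are not taken from the thesis itself.

```python
# Hedged sketch of the McDRAM scaling argument (assumed parameters, not
# from the thesis). The key idea: every bank feeds its prefetched data
# into local MAC units, so compute throughput scales with bank count.

CHANNELS = 8            # an HBM2 stack exposes 8 independent channels
BANKS_PER_CHANNEL = 16  # assumed banks per channel
MACS_PER_BANK = 48      # assumed value chosen so the total matches 6,144

total_macs = CHANNELS * BANKS_PER_CHANNEL * MACS_PER_BANK
print(total_macs)  # 6144, the package-level MAC count cited in the abstract

# Why internal bandwidth wins: all banks stream to their local MACs in
# parallel, while the external bus serializes transfers over fixed pins.
EXTERNAL_BUS_BITS = 1024                 # HBM2 stack-wide external data width
PREFETCH_BITS_PER_BANK = 256             # assumed per-bank prefetch width
internal_bits = CHANNELS * BANKS_PER_CHANNEL * PREFETCH_BITS_PER_BANK
print(internal_bits // EXTERNAL_BUS_BITS)  # 32x wider internally, under these assumptions
```

Under these assumed numbers, in-DRAM MACs see roughly 32x the data width available at the package pins, which is the bandwidth gap McDRAM is designed to exploit.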
Language
English
URI
https://hdl.handle.net/10371/141557