Improving Memory System Performance through Exploiting Asymmetries in Heterogeneous Memories

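Graduate School of Convergence Science and Technology, Dept. of Transdisciplinary Studies (Intelligent Convergence Systems major)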
Seoul National University Graduate School
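Thesis (Ph.D.) -- Seoul National University Graduate School: Graduate School of Convergence Science and Technology, Dept. of Transdisciplinary Studies (Intelligent Convergence Systems major), August 2018. 안정호.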
The memory system has a significant impact on application performance in computing systems. Improving memory system performance is therefore important, and that performance depends on several factors: the intrinsic access latency and bandwidth of the main memory device (e.g., DRAM), its microarchitecture, the total memory capacity, and the memory controller design (including control methods based on device characteristics). First, we focus on DRAM access latency, which often directly affects application execution time and is especially critical for applications that lack locality or memory-level parallelism. However, access latency has improved slowly across DRAM generations compared with capacity and bandwidth. Although low-latency DRAM organizations have been introduced, they either increase the chip dimensions substantially or require software modification because of non-uniform access latency. Moreover, most of them reduce row activation and precharge time rather than column access latency, which is more critical. Next, we focus on memory capacity, because satisfying the demand for higher memory capacity is a major problem for computing systems. Conventional solutions are reaching their limits, so DRAM/NVM hybrid main memory systems, which combine emerging Non-Volatile Memory (NVM) for large capacity with a DRAM last-level cache for high access speed, have been proposed to make further improvements. However, in these systems the two device types share a limited number of memory channels (or ranks), and the NVM channels (or ranks) are often less utilized than the DRAM ones. This imbalance in channel (or rank) utilization deteriorates memory system performance when an application needs moderate bandwidth. Last, we focus on the performance of NVM. Phase Change Memory (PCM) is a promising NVM candidate for DRAM/NVM hybrid main memory systems because of its high capacity and low standby power. However, its poor write performance is a critical obstacle to full adoption as a main memory device. Because of its high write power consumption and high write latency, PCM write throughput is severely limited under chip power restrictions.

To overcome the three aforementioned limitations, we propose three novel techniques to improve memory system performance. First, a DRAM microarchitectural technique, SOUP (Skewed Organization of µbanks with Pipelined accesses), provides uniformly low column access time over the entire DRAM chip by leveraging the asymmetry in column access latency within a bank, which arises from the non-uniform distance to the column decoders. By starting I/O transfers as soon as data from near cells arrive instead of waiting for the entire column data, SOUP saves three memory clock cycles for column accesses to all banks. In our evaluations, SOUP improves IPC and EDP by up to 7.7% and 12.2%, respectively, over the baseline DDR4 device for memory-intensive SPEC CPU2006 workloads while incurring negligible area overhead. Second, for DRAM/NVM hybrid main memory systems, we propose OBYST (On hit BYpass to STeal bandwidth), a load balancing technique between DRAM and NVM channels (or ranks) that improves memory bandwidth by selectively sending read requests that hit in the DRAM cache to NVM instead of to busy DRAM. We also propose an inter-device request scheduling policy optimized for OBYST. With negligible area overhead, OBYST improves bandwidth, IPC, and EDP by up to 22%, 21%, and 26%, respectively, over the baseline without bandwidth optimizations. Last, we propose Reset-In-Set, a PCM write throughput improvement technique that enables the PCM to execute more write requests concurrently by reducing the peak power of multi-bit writes. The peak write power is reduced by delaying short 0 writes until the lowest-power region of long 1 writes. This technique decreases the average PCM write latency substantially, and simulation results show that Reset-In-Set increases the performance of a system with DRAM/PCM hybrid main memory by up to 44% with negligible implementation overhead.
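The peak-power idea behind Reset-In-Set can be illustrated with a small sketch. It is not the thesis's implementation: the pulse shapes, power values, and power budget below are hypothetical, and the sketch assumes that a long 1 write draws the most power at its start and tapers off, so its lowest-power region is at the end. It models one 0 write overlapping one 1 write and compares starting both together against delaying the 0 write.

```python
# Hypothetical per-time-slot power of one long 1 write (SET), arbitrary units.
# Assumption: power is highest at the start and tapers off toward the end.
SET_PROFILE = [4, 4, 3, 3, 2, 2, 1, 1]
# Hypothetical short, high-power 0 write (RESET).
RESET_PROFILE = [6, 6]
# Hypothetical chip power budget shared by concurrent writes.
POWER_BUDGET = 70


def combined_profile(set_profile, reset_profile, reset_start):
    """Sum the power of a SET pulse and a RESET pulse starting at reset_start."""
    length = max(len(set_profile), reset_start + len(reset_profile))
    total = [0] * length
    for t, p in enumerate(set_profile):
        total[t] += p
    for t, p in enumerate(reset_profile):
        total[reset_start + t] += p
    return total


def best_reset_start(set_profile, reset_profile):
    """Pick the RESET start slot, inside the SET pulse, that minimizes peak power."""
    last_start = len(set_profile) - len(reset_profile)
    return min(
        range(last_start + 1),
        key=lambda s: max(combined_profile(set_profile, reset_profile, s)),
    )


# Naive overlap: both pulses start at slot 0.
naive_peak = max(combined_profile(SET_PROFILE, RESET_PROFILE, 0))

# Delayed overlap: the RESET pulse is moved to the lowest-power region.
delayed_start = best_reset_start(SET_PROFILE, RESET_PROFILE)
delayed_peak = max(combined_profile(SET_PROFILE, RESET_PROFILE, delayed_start))

# A lower peak lets more writes fit under the same power budget.
naive_concurrency = POWER_BUDGET // naive_peak
delayed_concurrency = POWER_BUDGET // delayed_peak
```

With these made-up numbers, the peak drops from 10 to 7 units when the 0 write is delayed to slot 6, so 10 rather than 7 such writes fit under the 70-unit budget. The real gain depends on the actual pulse shapes and the chip's power limit.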

In summary, we improve DRAM access latency and PCM write throughput by proposing SOUP and Reset-In-Set, respectively. These methods are effective independently for DRAM-only and PCM-only main memory systems, and cooperatively for DRAM/NVM hybrid systems. OBYST additionally enhances the memory bandwidth of hybrid systems that consist of heterogeneous memory devices.
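The bypass decision that OBYST makes on a DRAM cache hit can be sketched as follows. This is only an illustration under stated assumptions, not the scheduling policy from the thesis: `ChannelState`, `estimated_wait_ns`, and `route_read_hit` are hypothetical names, the wait estimate is a simple queue-length model, and the sketch assumes a hit may be served from NVM only when the cached line is clean (that is, NVM still holds an up-to-date copy).

```python
from dataclasses import dataclass


@dataclass
class ChannelState:
    """Snapshot of one memory channel (or rank); all fields are illustrative."""
    queued_requests: int     # requests already waiting on this channel
    service_time_ns: float   # assumed average time to serve one request


def estimated_wait_ns(ch: ChannelState) -> float:
    """Rough completion time for a new request: drain the queue, then serve it."""
    return (ch.queued_requests + 1) * ch.service_time_ns


def route_read_hit(dram: ChannelState, nvm: ChannelState, line_is_clean: bool) -> str:
    """Decide where to send a read that hits in the DRAM cache.

    Assumption: a dirty line must be read from DRAM because the NVM copy is stale.
    """
    if not line_is_clean:
        return "DRAM"
    # Bypass to NVM only when the slower device is still expected to finish first.
    if estimated_wait_ns(nvm) < estimated_wait_ns(dram):
        return "NVM"
    return "DRAM"


# Busy DRAM, lightly loaded NVM: the hit is redirected to NVM.
busy_choice = route_read_hit(ChannelState(12, 50.0), ChannelState(1, 150.0), True)
# Idle DRAM: the hit stays on the faster device.
idle_choice = route_read_hit(ChannelState(0, 50.0), ChannelState(1, 150.0), True)
```

In this example the busy case routes to NVM (an estimated 300 ns against 650 ns) and the idle case stays on DRAM. A real controller would also have to account for bank state, write traffic, and the inter-device scheduling policy mentioned in the abstract.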