Improving Memory System Performance through Exploiting Asymmetries in Heterogeneous Memories : 이종 메모리의 비대칭성을 이용하는 메모리 시스템 성능 향상 기술

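DC Field | Value | Language
dc.contributor.advisor | 안정호 | -
dc.contributor.author | 노유환 | -
dc.date.accessioned | 2018-11-12T00:57:36Z | -
dc.date.available | 2018-11-12T00:57:36Z | -
dc.date.issued | 2018-08 | -
dc.identifier.other | 000000152146 | -
dc.identifier.uri | https://hdl.handle.net/10371/143158 | -
dc.description | Thesis (Ph.D.) -- Seoul National University Graduate School: Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies (Intelligent Convergence Systems Major), August 2018. Advisor: 안정호. | -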
dc.description.abstract | The memory system of a computing system has a significant impact on application performance. It is therefore important to improve memory system performance, which depends on factors such as the intrinsic access latency and bandwidth of the main memory device (e.g., DRAM), its microarchitecture, the total memory capacity, and the memory controller design (including the control method based on device characteristics). First, we focus on DRAM access latency, which often directly affects application execution time. This is especially critical for applications that lack locality or memory-level parallelism. However, DRAM access latency has improved slowly across generations compared with DRAM capacity and bandwidth. Although low-latency DRAM organizations have been introduced, they increase the chip dimensions substantially or require software modification due to non-uniform access latency. Moreover, most of them reduce row activation and precharge time rather than column access latency, which is more critical. Next, we focus on memory capacity, as satisfying the demand for higher memory capacity is a major problem for computing systems. Conventional solutions are reaching their limits, so DRAM/NVM hybrid main memory systems, which combine emerging Non-Volatile Memory (NVM) for large capacity with a DRAM last-level cache for high access speed, have been proposed to make further improvements. However, in these systems the two device types share a limited number of memory channels (or ranks), and the NVM channels (or ranks) are often less utilized than the DRAM ones. This imbalance in channel (or rank) utilization deteriorates memory system performance when an application needs moderate bandwidth. Last, we focus on the performance of NVM. Phase Change Memory (PCM) is a promising NVM candidate for DRAM/NVM hybrid main memory systems due to its high capacity and low standby power. However, its poor write performance is a critical obstacle to full adoption as a main memory device. Due to the high write power consumption and high write latency, PCM write throughput is severely limited under the chip power restriction.
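
The power-limited write throughput described above can be illustrated with a short calculation. This is a minimal sketch and not the model used in the thesis: the function names, the power budget, the per-write peak power, and the write latency are all hypothetical values chosen for illustration. It assumes that the number of writes in flight is capped by the chip power budget divided by the peak power of one write, and that each write holds its slot for the full write latency.

```python
def max_concurrent_writes(chip_power_budget_mw: float,
                          peak_power_per_write_mw: float) -> int:
    """Number of writes whose peak power fits under the chip budget at once.

    Both arguments are hypothetical illustration values, not thesis data.
    """
    if chip_power_budget_mw < 0 or peak_power_per_write_mw <= 0:
        raise ValueError("budget must be >= 0 and per-write power must be > 0")
    return int(chip_power_budget_mw // peak_power_per_write_mw)


def write_throughput_per_us(chip_power_budget_mw: float,
                            peak_power_per_write_mw: float,
                            write_latency_ns: float) -> float:
    """Writes completed per microsecond when every power slot stays busy."""
    if write_latency_ns <= 0:
        raise ValueError("write latency must be > 0")
    slots = max_concurrent_writes(chip_power_budget_mw, peak_power_per_write_mw)
    return slots * 1000.0 / write_latency_ns


if __name__ == "__main__":
    # Assumed numbers: 100 mW budget, 200 ns write latency.
    print(write_throughput_per_us(100.0, 40.0, 200.0))  # 2 slots
    print(write_throughput_per_us(100.0, 25.0, 200.0))  # 4 slots
```

Under these assumed numbers, lowering the peak power of a write from 40 mW to 25 mW raises the number of concurrent writes from 2 to 4, which doubles the throughput at the same latency. This is the lever that a peak-power reduction technique pulls.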

To overcome the three aforementioned limitations, we propose three novel techniques to improve memory system performance. First, a DRAM microarchitectural technique, SOUP (Skewed Organization of µbanks with Pipelined accesses), is proposed to provide uniformly low column access time over the entire DRAM chip by leveraging the asymmetry in column access latency within a bank that arises from non-uniform distance to the column decoders. By starting I/O transfers as soon as data from near cells arrive instead of waiting for the entire column data, SOUP saves three memory clock cycles for column accesses to all banks. In our evaluations, SOUP improves IPC and EDP by up to 7.7% and 12.2%, respectively, over the baseline DDR4 device for memory-intensive SPEC CPU2006 workloads while incurring negligible area overhead. Second, for DRAM/NVM hybrid main memory systems, we propose a load balancing technique between DRAM and NVM channels (or ranks) called OBYST (On hit BYpass to STeal bandwidth), which improves memory bandwidth by selectively sending read requests that hit in the DRAM cache to NVM instead of the busy DRAM. We also propose an inter-device request scheduling policy optimized for OBYST. With negligible area overhead, OBYST improves bandwidth, IPC, and EDP by up to 22%, 21%, and 26%, respectively, over the baseline without bandwidth optimizations. Last, a PCM write throughput improvement technique called Reset-In-Set is proposed, which enables the PCM to concurrently execute more write requests by reducing the peak power of multi-bit writes. The peak write power reduction is achieved by delaying short 0 (RESET) writes until the lowest-power region of long 1 (SET) writes. This technique decreases the average PCM write latency substantially, and simulation results show that Reset-In-Set increases the performance of a system with DRAM/PCM hybrid main memory by up to 44% with negligible implementation overhead.
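
The peak-power idea behind Reset-In-Set, delaying a short 0 write into the lowest-power region of a long 1 write, can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's circuit or scheduler: the per-time-step power profile of the long write, the power and length of the short write, and both function names are hypothetical.

```python
from typing import Sequence


def peak_power(long_profile: Sequence[float], short_power: float,
               short_len: int, short_start: int) -> float:
    """Peak of the summed power when a short write overlaps a long write.

    long_profile is an assumed per-time-step power trace of the long write.
    """
    if short_start < 0 or short_start + short_len > len(long_profile):
        raise ValueError("short write must fit inside the long write window")
    total = list(long_profile)
    for t in range(short_start, short_start + short_len):
        total[t] += short_power
    return max(total)


def best_short_start(long_profile: Sequence[float], short_len: int) -> int:
    """Start offset whose window has the lowest maximum long-write power."""
    starts = range(len(long_profile) - short_len + 1)
    return min(starts, key=lambda s: max(long_profile[s:s + short_len]))


if __name__ == "__main__":
    # Hypothetical trace in arbitrary units: high at first, then a low region.
    profile = [8, 8, 6, 4, 2, 2, 4, 4]
    start = best_short_start(profile, short_len=2)
    print(peak_power(profile, 10, 2, 0))      # short write issued immediately
    print(peak_power(profile, 10, 2, start))  # short write delayed
```

With this assumed trace, issuing the short write immediately gives a peak of 18 units, while delaying it to offset 4 gives a peak of 12 units. A lower peak per multi-bit write leaves room for more concurrent writes under the same chip power limit.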

In summary, SOUP improves DRAM access latency and Reset-In-Set improves PCM write throughput. Each is effective on its own in DRAM-only and PCM-only main memory systems, respectively, and the two work together in DRAM/NVM hybrid systems. OBYST further enhances the memory bandwidth of hybrid systems that consist of heterogeneous memory devices.
-
dc.description.tableofcontents | Abstract
Introduction
  1.1 Outline
Background
  2.1 DDR4 DRAM Device Organization
  2.2 Low-Latency DRAM Organizations
  2.3 Sharing Channels or Ranks of DRAM and NVM
  2.4 Baseline DRAM/NVM hybrid memory structures
  2.5 Performance Degradation by Sharing Channels or Ranks
  2.6 PCM Cell Write Methods
SOUP: Reducing DRAM Column Access Latency
  3.1 Motivation
  3.2 Design of SOUP
    3.2.1 Modifications for the logical row composition
    3.2.2 Column access example with SOUP
  3.3 Implementation of SOUP
  3.4 Discussion
    3.4.1 Support for critical-word first
    3.4.2 Other target devices and alternative designs
    3.4.3 Repairing faulty cells in SOUP
  3.5 Experimental Methodology
  3.6 Evaluation
OBYST: Increasing the Bandwidth of DRAM/NVM Hybrid Memory Systems
  4.1 Design of OBYST
  4.2 Inter-Device Request Scheduling Policy for OBYST
  4.3 Implementation of OBYST
  4.4 Experimental Methodology
  4.5 Evaluation
    4.5.1 Improvements by OBYST
    4.5.2 Analysis on limitations of previous work
Reset-In-Set: Increasing PCM Write Throughput
  5.1 Reducing the Peak Power of Multi-bit Writes
  5.2 Making Rare Case Slow
  5.3 Implementation of Reset-In-Set
  5.4 Experimental Methodology
  5.5 Evaluation
    5.5.1 Improvements by Reset-In-Set
    5.5.2 A case study
Related Work
  6.1 Reducing DRAM Latency
  6.2 Effective DRAM Cache Implementations
  6.3 Improving PCM-based Main Memory Performance
Conclusion
Bibliography
Abstract in Korean
-
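dc.language.iso | en | -
dc.publisher | Seoul National University Graduate School | -
dc.subject.ddc | 620.82 | -
dc.title | Improving Memory System Performance through Exploiting Asymmetries in Heterogeneous Memories | -
dc.title.alternative | 이종 메모리의 비대칭성을 이용하는 메모리 시스템 성능 향상 기술 | -
dc.type | Thesis | -
dc.contributor.AlternativeAuthor | Yuhwan Ro | -
dc.description.degree | Doctor | -
dc.contributor.affiliation | Graduate School of Convergence Science and Technology, Department of Transdisciplinary Studies (Intelligent Convergence Systems Major) | -
dc.date.awarded | 2018-08 | -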