Detailed Information

Architecting Main Memory Systems of Manycore Processors : Main Memory System Architecture Design for Manycore Processor Systems

DC Field: Value
dc.contributor.advisor: 안정호
dc.contributor.author: 오성일
dc.date.accessioned: 2017-07-14T01:50:42Z
dc.date.available: 2017-07-14T01:50:42Z
dc.date.issued: 2015-08
dc.identifier.other: 000000063207
dc.identifier.uri: https://hdl.handle.net/10371/122396
dc.description: Thesis (Ph.D.) -- Seoul National University, Graduate School of Convergence Science and Technology: Intelligent Convergence Systems major, 2015. 8. 안정호.
dc.description.abstract: Manycore processors have become mainstream, and DRAM is widely used as their main memory. Applications have also been parallelized to exploit manycore systems efficiently, and their data sets keep growing. As a result, the main memory system has become the performance and energy bottleneck of modern manycore systems. Through-silicon interposer (TSI) technology is a promising way to architect high-bandwidth, energy-efficient main memory systems for modern manycore processors. While TSI improves I/O energy efficiency, it results in an unbalanced memory system design because the DRAM core dominates the overall energy consumption of manycore systems. However, few studies of DRAM device microarchitecture consider the system-level impact on the performance and energy efficiency of manycore systems.
To conduct research on modern manycore systems, we need a cycle-level timing simulator that provides detailed microarchitecture models of the core and uncore subsystems. The core subsystems of manycore processors can consist of traditional or asymmetric cores. The uncore subsystems have become more powerful and complex than ever, including deeper cache hierarchies, advanced on-chip interconnects, memory controllers, and main memory. We first implement a new cycle-level timing simulator, McSimA+, which enables microarchitectural studies of manycore systems and has detailed microarchitecture models of the core and uncore subsystems. McSimA+ is an application-level+ simulator, which combines the light weight of application-level simulators with the full control of threads and processes found in full-system simulators.
Then, we evaluate the system-level impact of DRAM array organizations on performance and power. We model modern DRAM array organizations by varying the number of banks and the DRAM row size per bank, and evaluate their area, power, and timing. The modeling results show that a larger DRAM row improves area efficiency and access time but increases activation/precharge energy. We evaluate the system-level impact of DRAM array organizations by simulating a manycore system with 3D-stacked DRAM memory. System performance and energy efficiency improve as each DRAM rank has more banks. While the 8KB DRAM row shows the best performance, the highest energy-delay product (EDP) is obtained when the DRAM row size is 2KB.
We finally propose a new TSI-based main memory system that resolves the imbalance between I/O energy and DRAM core energy. Our TSI-based main memory system uses a novel DRAM device microarchitecture called μbank. The μbank partitions each conventional bank into a large number of smaller banks (or μbanks) that operate independently with minimal area overhead. A massive number of μbanks provides ample bank-level parallelism and a lower bank conflict rate, and thus improves IPC and EDP by 1.62× and 4.80×, respectively, for memory-intensive SPEC 2006 benchmarks on average over the baseline DDR3-based memory system. We also show that μbank-based memory systems can simplify memory controller designs, because a simple open-page policy performs comparably to complex prediction-based page management policies. In our μbank-based memory system, the simple open-page policy achieves more than 95% of the performance of a perfect predictor.
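The abstract's claim that more independent banks lower the bank conflict rate under an open-page policy can be illustrated with a toy model. The sketch below is not McSimA+ and is not taken from the thesis: the function names, the address mapping (row-sized chunks interleaved across banks), and the contrived power-of-two-aligned trace are all assumptions made for illustration, and the model ignores timing, energy, and the row-size partitioning that μbanks also perform.

```python
def row_buffer_stats(addresses, num_banks, row_bytes):
    """Classify each access under an open-page policy (rows stay open after use).

    Assumed mapping (hypothetical): consecutive row-sized chunks are
    interleaved across banks. Returns (hits, misses, conflicts).
    """
    open_row = {}  # bank index -> row currently held in that bank's row buffer
    hits = misses = conflicts = 0
    for addr in addresses:
        chunk = addr // row_bytes
        bank = chunk % num_banks
        row = chunk // num_banks
        if bank not in open_row:
            misses += 1        # bank idle: the row must be activated
        elif open_row[bank] == row:
            hits += 1          # row already open: served from the row buffer
        else:
            conflicts += 1     # another row is open: precharge, then activate
        open_row[bank] = row
    return hits, misses, conflicts


def interleaved_streams(num_streams, accesses_per_stream, stride, region_bytes):
    """Round-robin interleaving of sequential streams, one per memory region."""
    trace = []
    for i in range(accesses_per_stream):
        for s in range(num_streams):
            trace.append(s * region_bytes + i * stride)
    return trace


if __name__ == "__main__":
    # Four streams, each reading 16 cache-line-sized steps inside one 2 KB row.
    trace = interleaved_streams(4, 16, 64, 1 << 20)
    for banks in (2, 2048):
        print(banks, row_buffer_stats(trace, banks, 2048))
    # prints: 2 (0, 1, 63)
    # prints: 2048 (60, 4, 0)
```

With two banks, all four streams land in the same bank and evict each other's rows on every access. With many banks, each stream keeps its own row open and nearly every access hits. The trace is deliberately adversarial, so the numbers show the mechanism only and say nothing about real workloads.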
dc.description.tableofcontents:
Abstract
Contents
List of Figures
List of Tables
1 Introduction
1.1 Research Contributions
1.2 Outline
2 Main Memory System Organizations
2.1 DRAM Microarchitecture
2.2 DRAM Modules or DRAM Microarchitectures for Manycore Systems
2.2.1 New DRAM Modules for Manycore Systems
2.2.2 New DRAM Microarchitectures for Manycore Systems
2.3 Memory Controller Organizations
2.3.1 Memory Access Scheduling Policies
2.3.2 Page Management Policies
3 Manycore Simulation Infrastructure
3.1 Why Yet Another Simulator?
3.2 McSimA+: Overview and Operation
3.2.1 Thread Management for Application-level+ Simulation
3.2.2 Implementing the Thread Management Layer
3.3 Modeling of Manycore Architecture
3.3.1 Modeling of Core Subsystem
3.3.2 Modeling of Cache and Coherence Hardware
3.3.3 Modeling of Network-on-Chips (NoCs)
3.3.4 Modeling of the Memory Controller and Main Memory
3.4 Validation
3.5 Clustering Effect in Asymmetric Manycore Processors
3.5.1 Manycore with Asymmetric Within or Between Clusters
3.5.2 Evaluation
3.6 Limitations and Scope of McSimA+
4 Energy-Efficient DRAM Array Organizations
4.1 Energy-Efficient DRAM Array Organizations
4.2 Evaluation
4.2.1 Experimental Setup
4.2.2 The System-level Impact of DRAM Array Organizations
5 Silicon Interposer-Based Main Memory Systems
5.1 Through-Silicon Interposer (TSI)
5.1.1 TSI Technology Overview
5.1.2 The Energy Efficiency and Latency Impact of the TSI
5.2 Microbank: A DRAM Device Organization for TSI-based Main Memory Systems
5.2.1 Motivation for Microbank
5.2.2 Microbank Overhead
5.3 Revisiting DRAM Page Management Policies
5.4 Evaluation
5.4.1 Experimental Setup
5.4.2 Test Workloads
5.4.3 The System-level Impact of the μbanks
5.4.4 The Impact of Address Interleaving and Prediction-Based Page-Management Schemes on μbanks
5.4.5 The Impact of OS Page Migration
5.4.6 The Impact of DRAM Refresh
5.4.7 The Impact of TSI on Processor-Memory Interfaces
5.5 Related Work
6 Conclusion
6.1 Future Work
Bibliography
Abstract in Korean
Acknowledgments
dc.format: application/pdf
dc.format.extent: 2549766 bytes
dc.format.medium: application/pdf
dc.language.iso: en
dc.publisher: Seoul National University, Graduate School of Convergence Science and Technology
dc.subject: manycore
dc.subject: memory system
dc.subject: DRAM
dc.subject: μbanks
dc.subject: manycore simulator
dc.subject: Through-Silicon Interposer (TSI)
dc.subject.ddc: 620
dc.title: Architecting Main Memory Systems of Manycore Processors
dc.title.alternative: Main Memory System Architecture Design for Manycore Processor Systems
dc.type: Thesis
dc.description.degree: Doctor
dc.citation.pages: 148
dc.contributor.affiliation: Graduate School of Convergence Science and Technology, Department of Intelligent Convergence Systems
dc.date.awarded: 2015-08

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
