Optimizing Memory Management Systems for High Performance and Scalability : 높은 성능과 확장성을 위한 메모리 관리 시스템 최적화

DC Field: Value [Language]
dc.contributor.advisor: 염헌영
dc.contributor.author: 박성재
dc.date.accessioned: 2019-10-21T02:27:33Z
dc.date.available: 2019-10-21T02:27:33Z
dc.date.issued: 2019-08
dc.identifier.other: 000000156341
dc.identifier.uri: https://hdl.handle.net/10371/162025
dc.identifier.uri: http://dcollection.snu.ac.kr/common/orgView/000000156341 [ko_KR]
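dc.description: Doctoral dissertation (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2019. 8. 염헌영.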
dc.description.abstract: One common characteristic of modern workloads that have emerged with recent computing paradigms, including cloud, big data, and machine learning, is memory intensiveness. Such workloads usually have huge working sets that in many cases cannot be fully accommodated in DRAM. They also tend to show low locality, so the small CPU cache cannot hide the access overhead of DRAM or lower-level memory.

Meanwhile, computing hardware has also evolved to keep pace with this change. (1) Computing systems are increasing the size of their main memory so that they can accommodate more of the huge working sets. As a result, data center servers with a few hundred gigabytes of DRAM have become common, and systems equipped with terabytes of DRAM exist. (2) Massive parallelism is becoming common and essential. Since the early 2000s, CPU vendors have increased the number of CPU cores instead of the CPU frequency because of heat dissipation and power consumption problems. Prevalent data center systems provide a few hundred CPU cores, and a few thousand CPU cores are not rare. Such many-core systems are normally built on a non-uniform memory access (NUMA) architecture. Therefore, efficient, effective, and NUMA-aware use of this parallelism is especially important for memory-intensive workloads.

Compared to these rapid changes in workload characteristics and hardware, memory management system software has not been sufficiently optimized. Consequently, it has become a bottleneck. In other words, memory-intensive modern workloads cannot fully utilize the evolved modern hardware unless the underlying memory management system is completely optimized. This paper provides an overview of a few limitations in existing memory management systems and introduces two optimization approaches for high performance and scalability of memory management systems. The first approach improves the performance of the memory system by guaranteeing huge page utilization under memory fragmentation. For this guarantee, we introduce a contiguous memory allocator that guarantees the success and low latency of its allocations. The second approach optimizes scalability on NUMA systems. For that, we optimize the virtual memory address space management system by changing the protection of the red-black tree that manages virtual memory areas (VMAs) from a global reader-writer lock to an RCU extension. Because existing RCU extensions, including the state of the art, are NUMA-oblivious, we also design a new RCU extension that provides NUMA-aware, scalable update-side synchronization.
dc.description.tableofcontents: Abstract
Chapter 1 Introduction
1.1 Motivation
1.2 Approaches
1.2.1 An Optimization for High Performance
1.2.2 An Optimization for High Scalability
1.3 Dissertation Structure
Chapter 2 Guaranteed Transparent Huge Pages Allocations
2.1 Introduction
2.2 Background
2.2.1 Devices using DMA
2.2.2 Huge Pages
2.2.3 Buddy Allocator
2.2.4 Memory Reservation
2.2.5 Contiguous Memory Allocator
2.3 Guaranteed CMA
2.3.1 Secondary Class Clients of GCMA
2.3.2 Limitations and Optimizations
2.4 Implementation
2.4.1 Contiguous Memory Allocation
2.4.2 DMEM: Discardable Memory
2.5 Guaranteed THP
2.6 Evaluation
2.6.1 Evaluation on a Mobile System
2.6.2 Evaluation on a Server System
2.7 Related Work
2.8 Conclusion
Chapter 3 A Scalable Virtual Address Space Protected by an HTM-based NUMA-aware RCU Extension
3.1 Introduction
3.2 Background and Related Work
3.2.1 Read-Copy Update
3.2.2 Hardware Transactional Memory
3.2.3 Related Work
3.3 An RCU Extension for NUMA Systems
3.3.1 Root Cause of HTM Performance Degradation on NUMA Systems
3.3.2 Design of RCX
3.3.3 Implementation
3.4 Evaluation
3.4.1 Evaluation Setup
3.4.2 Micro-benchmarks
3.4.3 Macro-benchmark
3.5 Conclusion
Chapter 4 Conclusion
dc.language.iso: eng
dc.publisher: Seoul National University Graduate School
dc.subject: Multicore
dc.subject: Parallelism
dc.subject: RCU
dc.subject: Fragmentation
dc.subject: Memory
dc.subject: Operating System
dc.subject.ddc: 621.39
dc.title: Optimizing Memory Management Systems for High Performance and Scalability
dc.title.alternative: 높은 성능과 확장성을 위한 메모리 관리 시스템 최적화
dc.type: Thesis
dc.type: Dissertation
dc.contributor.AlternativeAuthor: SeongJae Park
dc.contributor.department: College of Engineering, Department of Computer Science and Engineering
dc.description.degree: Doctor
dc.date.awarded: 2019-08
dc.identifier.uci: I804:11032-000000156341
dc.identifier.holdings: 000000000040▲000000000041▲000000156341▲

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.