I/O Performance Optimization Schemes for Manycore HPC Systems

Authors

Bang Jiwoo

Advisor
엄현상
Issue Date
2023-08
Publisher
Seoul National University
Keywords
High Performance Computing, Manycore Architecture, Fine-grained Lock, Lustre File System, ZFS, Unsupervised Learning, Prediction Model
Abstract
High-performance computing (HPC) systems are composed of thousands of compute nodes, storage systems, and high-speed networks, which provide multiple layers of I/O stacks with high complexity. To meet the increasing demand for data access performance in applications run on HPC systems, an efficient design of the HPC memory management system and storage file system is becoming more important. Moreover, HPC users need to be properly guided with optimal system configuration settings to avoid significant fluctuations in performance.

In this dissertation, our first focus is on reducing lock contention in the memory management system of an HPC manycore architecture. One of the critical sections that causes severe lock contention in the I/O path is the page management system, which protects multiple Least Recently Used (LRU) lists with a single lock instance. To solve this problem, we propose the Finer-LRU scheme, which optimizes the page reclamation process by splitting the LRU lists into multiple sub-lists, each with its own lock instance. Our evaluation results show that the Finer-LRU scheme improves sequential write throughput by 57.03% and reduces latency by 98.94% compared to the baseline Linux kernel version 5.2.8 on the Intel Knights Landing (KNL) architecture.
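The general idea of replacing one lock over a whole LRU structure with one lock per sub-list can be illustrated with a minimal user-space sketch. This is not the kernel implementation of Finer-LRU: the class name `ShardedLRU`, the modulo-based mapping of page identifiers to sub-lists, and the per-sub-list capacity are all assumptions made for illustration only.

```python
import threading
from collections import OrderedDict


class ShardedLRU:
    """Toy LRU split into sub-lists, each guarded by its own lock.

    Hypothetical illustration only; it does not reproduce the kernel's
    page reclamation logic.
    """

    def __init__(self, num_shards=4, capacity_per_shard=2):
        self.num_shards = num_shards
        self.capacity_per_shard = capacity_per_shard
        # One ordered sub-list and one lock per shard (assumed layout).
        self.shards = [OrderedDict() for _ in range(num_shards)]
        self.locks = [threading.Lock() for _ in range(num_shards)]

    def _shard_index(self, page_id):
        # Assumed mapping: integer page ids are spread by simple modulo.
        return page_id % self.num_shards

    def touch(self, page_id):
        """Mark a page as most recently used.

        Returns the id of the page evicted from the same sub-list,
        or None if nothing was evicted.
        """
        idx = self._shard_index(page_id)
        # Only the lock of the affected sub-list is taken, so threads
        # working on different sub-lists do not block each other.
        with self.locks[idx]:
            shard = self.shards[idx]
            if page_id in shard:
                shard.move_to_end(page_id)
                return None
            shard[page_id] = True
            if len(shard) > self.capacity_per_shard:
                evicted, _ = shard.popitem(last=False)
                return evicted
            return None
```

With a single lock, every call to `touch` would serialize on the same lock instance. In this sketch, two threads contend only when their pages map to the same sub-list, which is the property the finer-grained locking aims for.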

We also analyze the root cause of low I/O performance on a ZFS-based Lustre file system and propose a novel ZFS scheme, dynamic-ZFS, which combines two optimization approaches. The experimental results show that our approach can improve the sequential I/O performance by an average of 37%. We demonstrate that dynamic-ZFS can deliver I/O performance comparable to that of ldiskfs-based Lustre while still providing a multitude of beneficial features.

Finally, we employ multiple machine learning approaches to perform an in-depth analysis of I/O behaviors in HPC applications and to search for optimal configuration settings for jobs sharing similar I/O characteristics. Our overall results show that the proposed machine learning-based prediction models, whose R-squared score improves by up to 0.07, can predict with high accuracy the I/O performance that jobs run on HPC systems obtain under different configuration parameters.
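For readers unfamiliar with the metric, the R-squared score (coefficient of determination) compares a model's squared prediction error with the variance of the observed values. The sketch below uses the standard definition; the function name `r_squared` is a hypothetical helper and is not taken from the dissertation.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

A score of 1 means the predictions match the observations exactly, and a score of 0 means the model does no better than always predicting the mean, so an improvement of 0.07 is measured on that scale.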
Language
eng
URI
https://dcollection.snu.ac.kr/common/orgView/000000178050

https://hdl.handle.net/10371/196498
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.