Detailed Information

BBOS: Efficient HPC Storage Management via Burst Buffer Over-Subscription

DC Field | Value | Language
dc.contributor.author | Sung, Hanul | -
dc.contributor.author | Bang, Jiwoo | -
dc.contributor.author | Kim, Chungyong | -
dc.contributor.author | Kim, Hyung-Sin | -
dc.contributor.author | Sim, Alexander | -
dc.contributor.author | Lockwood, Glenn K. | -
dc.contributor.author | Eom, Hyeonsang | -
dc.date.accessioned | 2022-10-20T00:23:55Z | -
dc.date.available | 2022-10-20T00:23:55Z | -
dc.date.created | 2022-06-07 | -
dc.date.issued | 2020-05 | -
dc.identifier.citation | Proceedings - 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020, pp. 142-151 | -
dc.identifier.uri | https://hdl.handle.net/10371/186537 | -
dc.description.abstract | © 2020 IEEE. To avoid accessing the parallel file system (PFS), dedicated burst buffer (BB) allocation is preferred despite severe BB underutilization. Recently, new all-flash HPC storage systems that integrate BB and PFS have been proposed, which speed up access to PFS. For this reason, we adopt a BB over-subscription allocation method that lets HPC applications use BB only during their I/O phases, improving BB utilization. Unfortunately, BB over-subscription aggravates I/O interference and the overhead of demoting data from BB to PFS, resulting in degraded performance. To minimize this degradation, we develop an I/O scheduler that prevents I/O congestion and a new transparent data management system based on the checkpoint/restart characteristics of HPC applications. The proposed approach improves BB utilization while maintaining high application performance. In our experiments, BB utilization improves by at least 2.2x, and checkpoint performance is guaranteed to be more stable and higher than with other approaches. In addition, we achieve up to a 96.4% hit ratio for restart requests on BB and up to 3.1x higher restart performance than other approaches. | -
dc.language | English | -
dc.publisher | Institute of Electrical and Electronics Engineers Inc. | -
dc.title | BBOS: Efficient HPC Storage Management via Burst Buffer Over-Subscription | -
dc.type | Article | -
dc.identifier.doi | 10.1109/CCGrid49817.2020.00-79 | -
dc.citation.journaltitle | Proceedings - 20th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing, CCGRID 2020 | -
dc.identifier.wosid | 000649540400015 | -
dc.identifier.scopusid | 2-s2.0-85089083743 | -
dc.citation.endpage | 151 | -
dc.citation.startpage | 142 | -
dc.description.isOpenAccess | Y | -
dc.contributor.affiliatedAuthor | Kim, Hyung-Sin | -
dc.contributor.affiliatedAuthor | Eom, Hyeonsang | -
dc.type.docType | Conference Paper | -
dc.description.journalClass | 1 | -
dc.subject.keywordAuthor | Burst Buffer | -
dc.subject.keywordAuthor | Checkpoint | -
dc.subject.keywordAuthor | Demotion | -
dc.subject.keywordAuthor | Over-subscription | -
dc.subject.keywordAuthor | PFS | -
dc.subject.keywordAuthor | Restart | -
Files in This Item:
There are no files associated with this item.

Related Researcher

  • Graduate School of Data Science
    Research Area: Distributed machine learning, Edge, Mobile AI

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
