Publications
Detailed Information
Comparing unified, pinned, and host/device memory allocations for memory-intensive workloads on Tegra SoC
Cited 6 times in Web of Science
Cited 6 times in Scopus
- Authors
- Issue Date
- 2021-02-25
- Publisher
- John Wiley & Sons Inc.
- Citation
- Concurrency and Computation: Practice and Experience, Vol.33 No.4, p. e6018
- Abstract
- Edge computing focuses on processing near the source of the data. Edge computing devices built on the Tegra SoC architecture provide a physically unified GPU memory architecture, and taking advantage of it requires considering different modes of memory allocation. Different GPU memory allocation techniques yield different memory usage and execution times for identical applications on Tegra devices. In this article, we implement several GPU application benchmarks, including our custom CFD code, with unified, pinned, and normal host/device memory allocation modes. We evaluate and compare the memory usage and execution time of these workloads on edge computing Tegra systems-on-chip (SoCs) equipped with integrated GPUs using a shared memory architecture, and on non-SoC machines with discrete GPUs equipped with distinct VRAM. We discover that using normal memory allocation methods on SoCs consumes double the required memory because of unnecessary device memory copies, even though device memory is physically shared with host memory. We show that GPU application memory usage can be reduced by up to 50%, and that performance can even improve, simply by replacing normal memory allocation and memory copy methods with managed unified memory or pinned memory allocation.
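- The three allocation modes compared in the abstract can be illustrated with a minimal CUDA sketch. This is not code from the paper: the kernel, sizes, and names are hypothetical, and error checking is omitted for brevity. It shows why the normal host/device pattern doubles memory use on a shared-memory SoC, while the managed and pinned variants keep a single buffer.

```cuda
#include <cuda_runtime.h>
#include <cstdlib>

// Hypothetical kernel standing in for a memory-intensive workload.
__global__ void scale(float *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) data[i] *= 2.0f;
}

int main() {
    const int n = 1 << 20;
    const size_t bytes = n * sizeof(float);
    const int blocks = (n + 255) / 256;

    // (1) Normal host/device allocation: on a Tegra SoC the explicit
    // cudaMemcpy duplicates the buffer inside the same physical DRAM,
    // which is the source of the ~2x memory usage reported above.
    float *h = (float *)malloc(bytes), *d;
    cudaMalloc(&d, bytes);
    cudaMemcpy(d, h, bytes, cudaMemcpyHostToDevice);
    scale<<<blocks, 256>>>(d, n);
    cudaMemcpy(h, d, bytes, cudaMemcpyDeviceToHost);
    cudaFree(d);
    free(h);

    // (2) Unified (managed) memory: one allocation visible to both CPU
    // and GPU, with no explicit copies, so only one buffer exists on a
    // shared-memory SoC.
    float *u;
    cudaMallocManaged(&u, bytes);
    scale<<<blocks, 256>>>(u, n);
    cudaDeviceSynchronize();
    cudaFree(u);

    // (3) Pinned (page-locked) host memory: the GPU accesses it through
    // a mapped device pointer, again avoiding a second copy. On systems
    // with unified virtual addressing this mapping is available by default.
    float *p, *p_dev;
    cudaMallocHost(&p, bytes);
    cudaHostGetDevicePointer(&p_dev, p, 0);
    scale<<<blocks, 256>>>(p_dev, n);
    cudaDeviceSynchronize();
    cudaFreeHost(p);
    return 0;
}
```

- On a discrete GPU, variant (1) is often fastest because the data lands in VRAM; the abstract's finding is that on shared-memory Tegra SoCs the copy buys nothing and variants (2) and (3) can both halve memory usage and improve performance.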
- ISSN
- 1532-0626
- Files in This Item:
- There are no files associated with this item.
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.