Publications

Detailed Information

ADC-PIM: Accelerating Convolution on the GPU via In-Memory Approximate Data Comparison

DC Field                           Value
dc.contributor.author              Choi, Jungwoo
dc.contributor.author              Lee, Hyuk-Jae
dc.contributor.author              Rhee, Chae Eun
dc.date.accessioned                2022-10-05T04:10:04Z
dc.date.available                  2022-10-05T04:10:04Z
dc.date.created                    2022-07-11
dc.date.issued                     2022-06
dc.identifier.citation             IEEE Journal on Emerging and Selected Topics in Circuits and Systems, Vol.12 No.2, pp.458-471
dc.identifier.issn                 2156-3357
dc.identifier.uri                  https://hdl.handle.net/10371/185300
dc.description.abstract            Recently, convolutional neural networks (CNNs) have been widely used in image processing and computer vision. GPUs are often used to accelerate CNNs, but performance is limited by the high computational cost and memory usage of convolution. Prior studies exploited approximate computing to reduce the computational cost, but they reduce only the amount of computation, so performance remains bottlenecked by memory bandwidth as memory intensity increases. In addition, load imbalance between warps caused by approximation inhibits further performance improvement. In this paper, we propose a processing-in-memory (PIM) solution that reduces both data movement and computation through Approximate Data Comparison (ADC-PIM). Instead of determining value similarity after the data have been loaded to the GPU, the ADC-PIM unit located in 3D-stacked memory performs the similarity comparison and transfers only the selected representative data to the GPU. The GPU performs convolution on the representative data transferred from the ADC-PIM and reuses the calculated results according to the similarity information. To limit the increase in memory latency caused by the in-memory data comparison, we propose a two-level PIM architecture that exploits both the DRAM banks and the TSV stage: by distributing the comparisons across multiple banks and then merging the results at the TSV stage, the ADC-PIM effectively hides the delay of the comparisons. To ease load balancing on the GPU, the ADC-PIM reorganizes the data by assigning the representative data to addresses computed from the comparison results. Experimental results show that the proposed ADC-PIM provides a 43% speedup and a 32% energy saving with less than a 1% accuracy drop.
dc.language                        English
dc.publisher                       IEEE Circuits and Systems Society
dc.title                           ADC-PIM: Accelerating Convolution on the GPU via In-Memory Approximate Data Comparison
dc.type                            Article
dc.identifier.doi                  10.1109/JETCAS.2022.3167391
dc.citation.journaltitle           IEEE Journal on Emerging and Selected Topics in Circuits and Systems
dc.identifier.wosid                000811585100014
dc.identifier.scopusid             2-s2.0-85128668009
dc.citation.endpage                471
dc.citation.number                 2
dc.citation.startpage              458
dc.citation.volume                 12
dc.description.isOpenAccess        N
dc.contributor.affiliatedAuthor    Lee, Hyuk-Jae
dc.type.docType                    Article
dc.description.journalClass        1
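
The core idea in the abstract (group value-similar input patches, convolve only one representative per group, and reuse that result for the other members) can be illustrated with a small conceptual sketch. The following is a plain NumPy mock-up under assumed details: a mean-absolute-difference similarity metric, an illustrative threshold, and greedy grouping. It is not the paper's PIM hardware, GPU kernel, or actual comparison logic.

```python
# Conceptual sketch of representative-based approximate convolution.
# All names, the similarity metric, and the threshold are illustrative assumptions,
# not the ADC-PIM implementation described in the paper.
import numpy as np

def group_similar_patches(patches, threshold=0.05):
    """Greedy grouping: each patch joins the first representative within
    `threshold` (mean absolute difference); otherwise it becomes a new one."""
    reps = []                                    # indices of representative patches
    assign = np.empty(len(patches), dtype=int)   # patch index -> representative slot
    for i, p in enumerate(patches):
        for slot, r in enumerate(reps):
            if np.mean(np.abs(p - patches[r])) < threshold:
                assign[i] = slot
                break
        else:
            assign[i] = len(reps)
            reps.append(i)
    return np.array(reps), assign

def approx_conv(patches, kernel, threshold=0.05):
    """Convolve only the representative patches, then broadcast each result
    to the patches that were judged similar to that representative."""
    reps, assign = group_similar_patches(patches, threshold)
    rep_out = np.array([np.sum(patches[r] * kernel) for r in reps])  # conv as dot product
    return rep_out[assign]                       # each patch reuses its representative's output

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.random((3, 3))
    # 8 patches: several near-duplicates of `base`, plus two unrelated ones
    patches = np.stack([base + rng.normal(0, 0.01, (3, 3)) for _ in range(6)] +
                       [rng.random((3, 3)) for _ in range(2)])
    kernel = rng.random((3, 3))
    exact = np.array([np.sum(p * kernel) for p in patches])
    approx = approx_conv(patches, kernel, threshold=0.05)
    print("max abs error:", np.max(np.abs(exact - approx)))
```

Running the sketch shows that the near-duplicate patches map to a single representative, so only that patch is convolved; this is the source of both the computation savings and the bounded accuracy loss reported in the abstract.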
Files in This Item:
There are no files associated with this item.

Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.
