NLP-Fast: A Fast, Scalable, and Flexible System to Accelerate Large-Scale Heterogeneous NLP Models

Kim, Joonsung; Hur, Suyeon; Lee, Eunbok; Lee, Seungho; Kim, Jangwoo

doi:10.1109/PACT52795.2021.00013

DC Field	Value	Language
dc.contributor.author	Kim, Joonsung	-
dc.contributor.author	Hur, Suyeon	-
dc.contributor.author	Lee, Eunbok	-
dc.contributor.author	Lee, Seungho	-
dc.contributor.author	Kim, Jangwoo	-
dc.date.accessioned	2022-10-05T04:09:56Z	-
dc.date.available	2022-10-05T04:09:56Z	-
dc.date.created	2022-07-22	-
dc.date.issued	2021-10	-
dc.identifier.citation	30TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT 2021), pp.75-89	-
dc.identifier.issn	1089-795X	-
dc.identifier.uri	https://hdl.handle.net/10371/185289	-
dc.description.abstract	Emerging natural language processing (NLP) models have become larger and more complex to provide more sophisticated NLP services. Accordingly, there is strong demand for scalable and flexible computer infrastructure to support these large-scale, complex, and diverse NLP models. However, existing proposals cannot provide enough scalability and flexibility, because they focus on optimizing specific operations and neither identify nor optimize the wide spectrum of performance-critical operations that appear in recent NLP models. In this paper, we propose NLP-Fast, a novel system solution to accelerate a wide spectrum of large-scale NLP models. NLP-Fast consists of two main parts: (1) NLP-Perf, an in-depth performance analysis tool that identifies critical operations in emerging NLP models, and (2) NLP-Opt, three end-to-end optimization techniques that accelerate the identified performance-critical operations on various hardware platforms (e.g., CPU, GPU, FPGA). In this way, NLP-Fast can accelerate various types of NLP models on different hardware platforms by identifying their critical operations through NLP-Perf and applying NLP-Opt's holistic optimizations. We evaluate NLP-Fast on CPU, GPU, and FPGA, and overall throughput increases by up to 2.92x, 1.59x, and 4.47x, respectively, over each platform's baseline. We release NLP-Fast to the community so that users can easily run NLP-Fast's analysis and apply its optimizations to their own NLP applications.	-
dc.language	English	-
dc.publisher	IEEE COMPUTER SOC	-
dc.title	NLP-Fast: A Fast, Scalable, and Flexible System to Accelerate Large-Scale Heterogeneous NLP Models	-
dc.type	Article	-
dc.identifier.doi	10.1109/PACT52795.2021.00013	-
dc.citation.journaltitle	30TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT 2021)	-
dc.identifier.wosid	000758464500006	-
dc.identifier.scopusid	2-s2.0-85125736429	-
dc.citation.endpage	89	-
dc.citation.startpage	75	-
dc.description.isOpenAccess	N	-
dc.contributor.affiliatedAuthor	Kim, Jangwoo	-
dc.type.docType	Proceedings Paper	-
dc.description.journalClass	1	-

Appears in Collections:

College of Engineering/Engineering Practice School
- Dept. of Electrical and Computer Engineering
  - Journal Papers

Files in This Item: There are no files associated with this item.
