Detailed Information
NLP-Fast: A Fast, Scalable, and Flexible System to Accelerate Large-Scale Heterogeneous NLP Models
DC Field | Value | Language |
---|---|---|
dc.contributor.author | Kim, Joonsung | - |
dc.contributor.author | Hur, Suyeon | - |
dc.contributor.author | Lee, Eunbok | - |
dc.contributor.author | Lee, Seungho | - |
dc.contributor.author | Kim, Jangwoo | - |
dc.date.accessioned | 2022-10-05T04:09:56Z | - |
dc.date.available | 2022-10-05T04:09:56Z | - |
dc.date.created | 2022-07-22 | - |
dc.date.issued | 2021-10 | - |
dc.identifier.citation | 30TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT 2021), pp.75-89 | - |
dc.identifier.issn | 1089-795X | - |
dc.identifier.uri | https://hdl.handle.net/10371/185289 | - |
dc.description.abstract | Emerging natural language processing (NLP) models have become larger and more complex in order to provide more sophisticated NLP services. Accordingly, there is also a strong demand for scalable and flexible computer infrastructure to support these large-scale, complex, and diverse NLP models. However, existing proposals cannot provide enough scalability and flexibility because they neither identify nor optimize the wide spectrum of performance-critical operations appearing in recent NLP models; they focus only on optimizing specific operations. In this paper, we propose NLP-Fast, a novel system solution to accelerate a wide spectrum of large-scale NLP models. NLP-Fast mainly consists of two parts: (1) NLP-Perf: an in-depth performance analysis tool to identify critical operations in emerging NLP models, and (2) NLP-Opt: three end-to-end optimization techniques to accelerate the identified performance-critical operations on various hardware platforms (e.g., CPU, GPU, FPGA). In this way, NLP-Fast can accelerate various types of NLP models on different hardware platforms by identifying their critical operations through NLP-Perf and applying NLP-Opt's holistic optimizations. We evaluate NLP-Fast on CPU, GPU, and FPGA, and the overall throughput increases by up to 2.92x, 1.59x, and 4.47x, respectively, over each platform's baseline. We release NLP-Fast to the community so that users can easily run NLP-Fast's analysis and apply its optimizations to their own NLP applications. | - |
dc.language | English | - |
dc.publisher | IEEE COMPUTER SOC | - |
dc.title | NLP-Fast: A Fast, Scalable, and Flexible System to Accelerate Large-Scale Heterogeneous NLP Models | - |
dc.type | Article | - |
dc.identifier.doi | 10.1109/PACT52795.2021.00013 | - |
dc.citation.journaltitle | 30TH INTERNATIONAL CONFERENCE ON PARALLEL ARCHITECTURES AND COMPILATION TECHNIQUES (PACT 2021) | - |
dc.identifier.wosid | 000758464500006 | - |
dc.identifier.scopusid | 2-s2.0-85125736429 | - |
dc.citation.endpage | 89 | - |
dc.citation.startpage | 75 | - |
dc.description.isOpenAccess | N | - |
dc.contributor.affiliatedAuthor | Kim, Jangwoo | - |
dc.type.docType | Proceedings Paper | - |
dc.description.journalClass | 1 | - |
Items in S-Space are protected by copyright, with all rights reserved, unless otherwise indicated.