Designing FPGA-based modular architectures for NLP models

Abstract: Neural-network-based natural language processing (NLP) models (e.g., LSTM, BERT) are emerging as promising solutions for NLP tasks. Because NLP tasks require immediate responses, these models must support fast inference in a single-batch environment. However, accelerating NLP models in a single batch is difficult because of three challenges: (1) a wide range of dimensions and irregular matrix operations, (2) non-negligible vector-operation latency, and (3) heterogeneity of vector operations.
In this paper, we propose FlexRun, an FPGA-based modular architecture approach that addresses these three challenges. FlexRun adaptively reconfigures the architecture to the input model. To this end, FlexRun consists of three parts. First, FlexRun:Architecture is a base architecture template with reconfigurable parameters. Next, FlexRun:Algorithm defines the design space and provides algorithms that find the best design points in it. Lastly, FlexRun:Automation automatically finds the best design and implements the resulting architecture. For evaluation, we use Intel's high-end FPGAs and achieve 2.69× and 1.44× speedups over a V100 GPU and a Brainwave-like FPGA baseline, respectively.
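Abstract (translated from Korean): Deep-learning-based natural language processing models are now widely used for NLP tasks such as speech recognition and translation. Because such tasks usually demand immediate responses, supporting fast inference of NLP models in a single-batch environment is essential. However, the characteristics of NLP models make single-batch acceleration difficult. These characteristics are: (1) a wide range of dimensions and unbalanced matrix operations, (2) the overhead of vector operations, and (3) the diversity of vector operations. This thesis proposes FlexRun, which addresses the three characteristics and accelerates NLP inference in a single-batch environment. FlexRun exploits the high reconfigurability of FPGAs to design an architecture tailored to the given target model. FlexRun comprises three techniques. The first is an FPGA-based base architecture template built from reconfigurable components. The second is an algorithm that defines the design space and finds the optimal design point in it for the target model. The last is a tool that automates the whole flow, from finding the optimal design to implementing the architecture. Applying FlexRun, this thesis demonstrates meaningful performance improvements over a GPU baseline and an FPGA-based Brainwave-like baseline.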
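
The design-space search mentioned in the abstract can be pictured with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the parameter names (`tile_rows`, `tile_cols`, `vector_lanes`), the toy latency and resource models, and the use of plain exhaustive search rather than FlexRun's own algorithms.

```python
from dataclasses import dataclass
from itertools import product
from math import ceil


@dataclass(frozen=True)
class DesignPoint:
    """Hypothetical reconfigurable parameters of an architecture template."""
    tile_rows: int     # rows handled per pass of the matrix unit (assumed)
    tile_cols: int     # columns handled per pass of the matrix unit (assumed)
    vector_lanes: int  # elements processed per cycle by the vector unit (assumed)


def estimated_cycles(point, layers):
    """Toy latency model: tiled matrix-vector products plus element-wise work."""
    total = 0
    for rows, cols in layers:
        # A partially filled tile still costs a full pass, which is why
        # irregular dimensions favour different tile shapes.
        total += ceil(rows / point.tile_rows) * ceil(cols / point.tile_cols)
        # Element-wise vector operations over the output vector.
        total += ceil(rows / point.vector_lanes)
    return total


def resource_cost(point):
    """Toy resource model: one unit per multiplier and per vector lane."""
    return point.tile_rows * point.tile_cols + point.vector_lanes


def best_design(layers, budget):
    """Exhaustively search the design space for the lowest estimated latency."""
    sizes = [16, 32, 64, 128]
    lanes = [16, 32, 64]
    candidates = (DesignPoint(r, c, v) for r, c, v in product(sizes, sizes, lanes))
    feasible = [p for p in candidates if resource_cost(p) <= budget]
    return min(feasible, key=lambda p: estimated_cycles(p, layers))
```

Calling `best_design([(1024, 1024)], budget=4200)` returns the feasible point with the lowest estimated cycle count for a single 1024-by-1024 layer. A real flow would replace both toy models with measurements or analytical models of the target FPGA.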

Language: kor

URI: https://hdl.handle.net/10371/181138

https://dcollection.snu.ac.kr/common/orgView/000000169616

Files in This Item:

000000169616.pdf 3.04 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Electrical and Computer Engineering (전기·정보공학부)
  - Theses (Master's Degree_전기·정보공학부)
