S3NAS: Fast NPU-aware Neural Architecture Search Methodology

Abstract: As convolutional neural networks (CNNs) spread to more applications on embedded devices, it has become popular to use a hardware CNN accelerator, called a neural processing unit (NPU), to achieve higher performance per watt than CPUs or GPUs. Recently, automated neural architecture search (NAS) has emerged as the default technique for finding state-of-the-art CNN architectures that achieve higher accuracy than manually designed architectures for image classification. In this paper, we present a fast NPU-aware NAS methodology, called S3NAS, that reflects the latest research results. It consists of three steps: supernet design, our modified Single-Path NAS for fast architecture exploration, and scaling and post-processing. First, we design a novel supernet, an over-parameterized network containing the candidate networks, to obtain a better CNN on the target NPU. We assign the number of blocks to each stage differently from conventional One-Shot NAS methods, and add depthwise convolution with multiple kernel sizes (MixConv) to the search space. Next, for a fast neural architecture search, we apply a modified Single-Path NAS technique to the proposed supernet structure. In this step, we assume a latency constraint shorter than the required one in order to reduce the search space and the search time. The last step is to scale up the network maximally within the latency constraint. With the proposed methodology, we are able to find a network in 4 hours using TPUv3 that achieves 82.70% top-1 accuracy on ImageNet with 11.63 ms latency. The search code is released at https://github.com/cap-lab/S3NAS
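
The last step, scaling the network up as far as the latency constraint allows, can be illustrated with a minimal sketch. It is not the thesis's implementation: the function names, the single scalar scale factor, the quadratic latency model, and the numbers 6.0 ms and 12.0 ms are all placeholders chosen for illustration. The only assumption the search itself relies on is that latency does not decrease as the network grows, which lets a simple bisection find the largest feasible scale.

```python
from typing import Callable


def toy_latency_ms(scale: float, base_ms: float = 6.0) -> float:
    """Hypothetical latency model: cost grows quadratically with the scale factor.

    A real flow would query an NPU simulator or measure on the device instead.
    """
    return base_ms * scale ** 2


def max_scale_within(
    latency_fn: Callable[[float], float],
    target_ms: float,
    lo: float = 1.0,
    hi: float = 4.0,
    tol: float = 1e-3,
) -> float:
    """Bisect for the largest scale in [lo, hi] with latency_fn(scale) <= target_ms.

    Assumes latency_fn is non-decreasing in scale.
    """
    if latency_fn(lo) > target_ms:
        raise ValueError("the searched network already exceeds the latency target")
    if latency_fn(hi) <= target_ms:
        return hi
    # Invariant: lo always meets the target, hi never does.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if latency_fn(mid) <= target_ms:
            lo = mid
        else:
            hi = mid
    return lo


# Placeholder numbers: a 6.0 ms searched network and a 12.0 ms latency budget.
scale = max_scale_within(toy_latency_ms, target_ms=12.0)
print(f"scale={scale:.3f}, latency={toy_latency_ms(scale):.2f} ms")
```

Searching under a tighter budget first and scaling afterwards, as the abstract describes, keeps the expensive search small; the scaling step then only needs repeated latency estimates, which is why a one-dimensional search like this is cheap by comparison.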

Language: eng

URI: https://hdl.handle.net/10371/175426

https://dcollection.snu.ac.kr/common/orgView/000000165328

Files in This Item:

000000165328.pdf 3.59 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Computer Science and Engineering (컴퓨터공학부)
  - Theses (Master's Degree_컴퓨터공학부)
