A Static Analyzer for Detecting Tensor Shape Errors in Deep Neural Network Training Code : 심층신경망 학습 코드의 텐서 형상 에러를 찾아내는 정적분석기

DC Field | Value | Language
dc.contributor.advisor | 허충길 | -
dc.contributor.author | 주호영 | -
dc.date.accessioned | 2022-12-29T07:43:55Z | -
dc.date.available | 2022-12-29T07:43:55Z | -
dc.date.issued | 2022 | -
dc.identifier.other | 000000172063 | -
dc.identifier.uri | https://hdl.handle.net/10371/187766 | -
dc.identifier.uri | https://dcollection.snu.ac.kr/common/orgView/000000172063 | ko_KR
dc.description | Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Computer Science and Engineering, 2022. 8. 허충길. | -
dc.description.abstract | This thesis presents PyTea, an automatic static analyzer that detects tensor shape errors in PyTorch code. Tensor shape errors are critical in deep neural network code: once a shape mismatch occurs in the middle of the training phase, much of the training cost and intermediate results are lost. Given PyTorch source code as input, PyTea statically traces every possible execution path, collects the tensor shape constraints required by the sequence of tensor operations on each path, and decides whether the constraints are unsatisfiable (in which case a shape error can occur). PyTea's scalability and precision hinge on characteristics of real-world PyTorch applications: the number of execution paths after PyTea's conservative pruning rarely explodes, and loops are simple enough to be circumscribed by symbolic abstraction. PyTea was tested against the projects in the official PyTorch repository and against code with tensor errors posted as questions on StackOverflow. PyTea successfully detects the tensor shape errors in this code, each within a few seconds. | -
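dc.description.abstract | This thesis introduces PyTea, an automatic static analyzer that detects tensor shape errors in PyTorch code. Tensor shape errors matter greatly in deep neural network training because a single occurrence can waste much training time and many intermediate results. PyTea takes PyTorch code, statically analyzes every possible execution path, collects for each path the tensor shape conditions under which the tensor operations can run without error, and determines whether all of those conditions can be satisfied, thereby detecting whether a tensor shape error exists. PyTea's scalability and precision rest on characteristics of real PyTorch programs: few paths remain after PyTea's symbolic abstraction and path simplification, and loops execute few enough times. PyTea was tested on the official PyTorch code repository and on tensor error code from StackOverflow, and in all of these experiments it detected the tensor shape errors within a few seconds. | -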
dc.description.tableofcontents |
Abstract 1

Chapter 1 Introduction 8
1.1 Our Goal 8
1.2 Structure of PyTorch Programs 8
1.3 Tensor Shape Errors 9

Chapter 2 Overview of PyTea Analyzer 15
2.1 Assumptions 16
2.2 Handling path explosions 17
2.3 Handling Loops 17

Chapter 3 Analysis Steps 19
3.1 PyTea IR 19
3.2 Constraint generation 20
3.2.1 Constraint generation rules for PyTea IR 22
3.2.2 Constraint types 22
3.2.3 Handling path explosion 25
3.3 Constraint check 26
3.3.1 Online constraint check 26
3.3.2 Offline constraint check 26

Chapter 4 Evaluation 28
4.1 Results 31
4.1.1 PyTea for PyTorch Examples 31
4.1.2 PyTea for StackOverflow questions 32
4.2 Discovered Errors in PyTorch Applications 33
4.2.1 Detecting insufficient data preprocessing 34
4.2.2 Handling path explosion 34
4.2.3 Handling both regular and residual batch sizes in the training loop 35
4.3 Limitation of PyTea 36

Chapter 5 Related Works and Conclusion 38

Chapter A Appendix 41
A.1 Supported Python syntax 41
A.2 Evaluation details 43
A.2.1 Specification of injected shape error 43
A.2.2 Analysis result of complete PyTorch project 44
A.2.3 Complete command-line arguments 45
A.2.4 Code modification points 45
A.2.5 Experiment comparison criteria 46
A.3 Complete definitions of PyTea IR syntax and semantics 47
A.3.1 Syntax 47
A.3.2 Constraint 48
A.3.3 Domain 49

Abstract (in Korean) 56
Acknowledgements 57
-
dc.format.extent | 57 | -
dc.language.iso | eng | -
dc.publisher | Seoul National University Graduate School | -
dc.subject | Static analysis | -
dc.subject | deep learning | -
dc.subject | tensor shape error | -
dc.subject | SMT solver | -
dc.subject | Python | -
dc.subject | PyTorch | -
dc.subject.ddc | 621.39 | -
dc.title | A Static Analyzer for Detecting Tensor Shape Errors in Deep Neural Network Training Code | -
dc.title.alternative | 심층신경망 학습 코드의 텐서 형상 에러를 찾아내는 정적분석기 | -
dc.type | Thesis | -
dc.type | Dissertation | -
dc.contributor.AlternativeAuthor | Ho Young Jhoo | -
dc.contributor.department | College of Engineering, Department of Computer Science and Engineering | -
dc.description.degree | Master's | -
dc.date.awarded | 2022-08 | -
dc.identifier.uci | I804:11032-000000172063 | -
dc.identifier.holdings | 000000000048▲000000000055▲000000172063▲ | -
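
The analysis outlined in the abstract (trace every execution path, collect the shape constraints each tensor operation imposes, report a path whose constraints cannot hold) can be illustrated with a minimal Python sketch. This is not PyTea's implementation: it assumes concrete integer dimensions rather than symbolic ones, checks each constraint directly instead of calling an SMT solver, and the helper names `linear`, `flatten`, and `find_shape_errors` are hypothetical.

```python
from itertools import product


def linear(in_features, out_features):
    """Return a shape rule for a fully connected layer (hypothetical helper)."""
    def rule(shape):
        # Constraint: the last dimension must match the layer's input width.
        ok = shape[-1] == in_features
        description = f"linear({in_features}, {out_features}) on {shape}"
        return shape[:-1] + (out_features,), ok, description
    return rule


def flatten():
    """Return a shape rule that merges every dimension after the batch dimension."""
    def rule(shape):
        size = 1
        for dim in shape[1:]:
            size *= dim
        # Flattening imposes no constraint, so it always succeeds.
        return (shape[0], size), True, f"flatten on {shape}"
    return rule


def find_shape_errors(input_shape, steps):
    """Walk every path through `steps` and record the first violated constraint.

    `steps` is a list of branch points; each branch point lists the alternative
    shape rules that a conditional in the program could select.
    """
    errors = []
    for path in product(*steps):  # one tuple of rules per execution path
        shape = input_shape
        for rule in path:
            shape, ok, description = rule(shape)
            if not ok:
                errors.append(description)
                break  # the rest of this path is unreachable after an error
    return errors


# A batch of 64 single-channel 28x28 images; one branch uses a wrongly sized layer.
steps = [
    [flatten()],
    [linear(784, 128), linear(768, 128)],  # the second alternative is the bug
    [linear(128, 10)],
]
errors = find_shape_errors((64, 1, 28, 28), steps)
print(errors)
```

Enumerating paths with a Cartesian product grows exponentially in the number of branch points, which is why the abstract stresses path pruning; the sketch makes no attempt to prune.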