Development of GPU-Based Deterministic and Probabilistic Direct Whole-core Calculation Systems

Abstract: The significant advances in CPU computing power over the past decades have made it possible to solve an entire reactor core directly with transport methods, an approach referred to as direct whole-core calculation. Because direct whole-core calculation provides high-fidelity, detailed solutions for the safety analysis of operating reactors and for the design of advanced reactors, demand for fast and accurate direct whole-core calculations has remained high. CPU-based computing platforms, however, are not yet fast enough to support direct whole-core calculation in routine design analyses. The legacy two-step methods are therefore still the primary nuclear design tools in industry.
Unfortunately, further improvements in CPU-based computing platforms that would make direct whole-core calculations affordable for industry are hard to expect because of the limitations imposed by the power and memory barriers. A complete changeover to a new computing platform is thus required to achieve a paradigm shift toward using direct whole-core calculation methods in routine core design.
In this regard, this research develops GPU acceleration methods and frameworks for the 2D/1D and continuous-energy Monte Carlo (MC) methods, which are the representative deterministic and probabilistic direct whole-core calculation methods, and lays the foundations for the practical use of direct whole-core calculations. The recent rise of the artificial intelligence (AI) and big data industries, together with increasing display resolutions, is creating a tremendous demand for computing power in both scientific computing and graphics processing, which in turn is driving advances in GPU computing technologies. A single consumer-grade GPU is already as powerful as hundreds of server CPU cores, and cutting-edge supercomputers rely on the high power efficiency of GPUs to reach their target computational capacity. By taking advantage of this megatrend in computing, this research aims to make direct whole-core calculation methods feasible in terms of both time and cost by using consumer-grade GPUs for the acceleration.
This research proposes algorithms and schemes for efficient GPU acceleration of the 2D/1D and continuous-energy MC methods. It does not stop at a collection of fragmentary algorithms, however, but integrates them into complete solution frameworks. To this end, the work is carried out with production-grade codes, in contrast to previous studies, which ended with limited implementations in mock-up codes or proxy applications. The study on GPU acceleration of the 2D/1D method is performed with the nTRACER code, and the entire solution procedure of the continuous-energy MC method is accelerated on GPUs through the development of the GPU-based continuous-energy MC code PRAGMA. The effectiveness of the developed algorithms is then demonstrated by applying them to real engineering problems.
Specifically, the steady-state calculation module of nTRACER, which comprises the planar MOC, CMFD, and axial solvers, is the target of GPU acceleration for the 2D/1D method. The MOC ray tracing and CMFD linear system solution schemes are optimized for massive parallelization, and CPU-GPU concurrency is exploited in the MOC calculation to take advantage of the heterogeneous computing environment. In the CMFD calculation, the massively parallelizable DSPAI preconditioner replaces the LU-type preconditioners, and an iterative refinement technique is introduced to implement mixed-precision arithmetic. For the axial solver, an axial MOC solver with improved parallel efficiency, accuracy, and stability is developed.
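The idea behind mixed-precision iterative refinement can be illustrated with a minimal sketch. This is not the nTRACER implementation: the tridiagonal test system, the Thomas-algorithm inner solver, and all function names are assumptions chosen for illustration, and single precision is emulated on the CPU with the standard `struct` module. The residual is evaluated in double precision, the correction equation is solved entirely in single precision, and the loop repeats until the double-precision residual is small.

```python
import struct

def f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]

def thomas_f32(lower, diag, upper, rhs):
    """Tridiagonal solve with every intermediate result rounded to float32."""
    n = len(diag)
    c = [0.0] * n  # modified upper diagonal
    d = [0.0] * n  # modified right-hand side
    c[0] = f32(f32(upper[0]) / f32(diag[0]))
    d[0] = f32(f32(rhs[0]) / f32(diag[0]))
    for i in range(1, n):
        denom = f32(f32(diag[i]) - f32(f32(lower[i]) * c[i - 1]))
        if i < n - 1:
            c[i] = f32(f32(upper[i]) / denom)
        d[i] = f32(f32(f32(rhs[i]) - f32(f32(lower[i]) * d[i - 1])) / denom)
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = f32(d[i] - f32(c[i] * x[i + 1]))
    return x

def residual(lower, diag, upper, x, b):
    """Compute r = b - A x in double precision."""
    n = len(diag)
    r = []
    for i in range(n):
        ax = diag[i] * x[i]
        if i > 0:
            ax += lower[i] * x[i - 1]
        if i < n - 1:
            ax += upper[i] * x[i + 1]
        r.append(b[i] - ax)
    return r

def iterative_refinement(lower, diag, upper, b, tol=1e-12, max_iter=20):
    """Double-precision outer loop around a single-precision inner solve."""
    x = [0.0] * len(diag)
    for it in range(max_iter):
        r = residual(lower, diag, upper, x, b)
        if max(abs(v) for v in r) < tol:
            return x, it
        correction = thomas_f32(lower, diag, upper, r)  # low-precision solve
        x = [xi + ci for xi, ci in zip(x, correction)]  # double-precision update
    return x, max_iter
```

On a GPU the attraction is that the expensive inner solve runs at single-precision speed and memory traffic, while the cheap residual update restores double-precision accuracy; the sketch only shows the arithmetic pattern, not the performance.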
In the continuous-energy MC method, namely in the development of PRAGMA, the optimization of cross section look-up and the vectorization of the random walk are the keys to GPU acceleration. The unionized grid method is improved by a linear hashing scheme and a nuclide-wise collapse of the temperature-dependent grids. A vectorized event-based tracking algorithm is developed, and region partitioning and energy sort schemes are employed together with it to significantly increase the chance of memory coalescing in the cross section look-up for depleted fuel calculations.
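A hash-accelerated energy grid search of the kind referred to above can be sketched as follows. This is an illustrative assumption, not PRAGMA's data structure: the bins are taken to be equal-width in energy, and the names `build_hash` and `lookup` are hypothetical. Each bin stores the index of the last grid point at or below its lower edge, so a look-up only has to search a short slice of the grid instead of the whole array.

```python
import bisect

def build_hash(grid, n_bins):
    """Precompute, for each bin edge, the last grid index at or below that edge."""
    e_min = grid[0]
    width = (grid[-1] - e_min) / n_bins
    start = [max(bisect.bisect_right(grid, e_min + k * width) - 1, 0)
             for k in range(n_bins + 1)]
    return e_min, width, start

def lookup(grid, table, energy):
    """Return i such that grid[i] <= energy < grid[i + 1], clamped to the grid."""
    e_min, width, start = table
    n_bins = len(start) - 1
    k = min(max(int((energy - e_min) / width), 0), n_bins - 1)
    # Widen the window by one bin on each side to guard against
    # floating-point rounding at the bin edges.
    lo = start[max(k - 1, 0)]
    hi = start[min(k + 2, n_bins)] + 1
    i = bisect.bisect_right(grid, energy, lo, hi) - 1
    return min(max(i, 0), len(grid) - 2)
```

For example, `lookup(grid, build_hash(grid, 256), e)` returns the interval index that a full binary search over `grid` would return, but the search range is bounded by the bin contents rather than by the grid size. On a GPU, shorter searches also mean fewer divergent memory accesses per thread.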
Various schemes for applying the GPU-based continuous-energy MC method to real operating reactors are also developed. For an effective treatment of resonance scattering, the RST target velocity sampling scheme, which resolves the drawbacks of DBRC and WCM, is developed, and a domain decomposition scheme is introduced to enable large-scale power reactor calculations within limited GPU memory capacity. In addition, a unique scheme named MSC is employed for practical MC depletion calculations, and the localized delta-tracking scheme treats the temperature distributions and material variations in the fuel pellets exactly and without additional cost, which enables efficient thermal feedback and depletion calculations. CMFD and ramp-up fission source convergence acceleration schemes are also introduced to improve the practicality of massive particle simulations.
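The reason delta tracking can handle spatially varying cross sections without boundary crossings is easiest to see in a toy problem. The sketch below is plain Woodcock delta tracking in a one-dimensional, purely absorbing, two-region slab, not the localized variant developed in the thesis; the cross section values, geometry, and function names are all assumptions for illustration. Flight distances are sampled from a single majorant cross section, and each tentative collision is accepted as real with probability equal to the ratio of the local cross section to the majorant.

```python
import math
import random

def sigma_t(x):
    """Hypothetical total cross section (1/cm): two regions of unit thickness."""
    return 1.0 if x < 1.0 else 0.3

def transmitted_fraction(n_particles, thickness=2.0, sigma_maj=1.0, seed=1):
    """Fraction of particles crossing the slab, estimated by delta tracking."""
    rng = random.Random(seed)
    transmitted = 0
    for _ in range(n_particles):
        x = 0.0
        while True:
            # Sample the flight distance from the majorant, ignoring interfaces.
            x += -math.log(1.0 - rng.random()) / sigma_maj
            if x >= thickness:
                transmitted += 1
                break
            # Accept a real collision with probability sigma_t(x) / sigma_maj.
            if rng.random() < sigma_t(x) / sigma_maj:
                break  # pure absorber: the history ends here
            # Otherwise the collision is virtual and the flight continues.
    return transmitted / n_particles
```

For this configuration the exact transmission probability is exp(-(1.0 * 1.0 + 0.3 * 1.0)) = exp(-1.3), so the estimate from `transmitted_fraction` should agree with that value within statistical noise. The same acceptance test works when `sigma_t` varies continuously, for instance with temperature, which is the property the text relies on.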
The results of this research demonstrate the high potential of direct whole-core calculation methods as practical nuclear design tools. A whole-core steady-state calculation with the 2D/1D method could be performed in a few minutes, and whole-core MC simulations with billions of particles have become a routine task that can be completed in minutes. All of these results were obtained on a practical computing cluster equipped with dozens of consumer-grade GPUs. GPU performance is still growing exponentially, and the practicality of the developed frameworks will continue to improve over time.
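Abstract (Korean version, translated): The remarkable growth in CPU computing performance over the past decades has made possible direct whole-core calculation, in which the entire core is solved directly with transport methods. Because direct whole-core calculation delivers high-fidelity, detailed results for the safety analysis of operating plants and the design of advanced reactors, there has been sustained demand for accurate and fast direct whole-core calculations. CPU computing environments, however, are still not fast enough for direct whole-core calculation to be used in routine design analysis. The traditional two-step method therefore remains the primary nuclear design tool in industry.
Unfortunately, further progress in CPU computing environments that would make direct whole-core calculation available to industry can no longer be expected because of the limits set by the power and memory barriers. A full transition to a new computing environment is therefore needed to bring about a fundamental shift in the nuclear design paradigm from the existing two-step calculation to direct whole-core calculation.
In this respect, this study develops GPU acceleration methods and frameworks for the 2D/1D and continuous-energy Monte Carlo methods, the representative deterministic and probabilistic direct whole-core calculation methods, and lays the groundwork for putting direct whole-core calculation into practical use. The recent rise of the artificial intelligence and big data industries and the increase in display resolutions are generating enormous computing demand in both scientific computing and graphics processing, which is accelerating the development of GPU computing technology. A single consumer GPU is already powerful enough to rival hundreds of server CPU cores, and leading-edge supercomputers rely on the high power efficiency of GPUs to reach their target performance. Riding this trend in the computing paradigm, this study seeks to make direct whole-core calculation methods practical in terms of cost as well as time through GPU acceleration on consumer GPUs.
This study presents algorithms and schemes for efficient GPU acceleration of the 2D/1D method and the continuous-energy Monte Carlo method. It does not stop at a collection of fragmentary algorithms, however, but integrates them into complete analysis frameworks. To this end, and unlike previous studies that were limited to restricted implementations in simplified codes, it is carried out with production-grade codes. The study on GPU acceleration of the 2D/1D method is based on nTRACER, and the entire solution procedure of the continuous-energy Monte Carlo method is GPU-accelerated through the development of the GPU-based continuous-energy Monte Carlo code PRAGMA. The effectiveness of the developed algorithms is then demonstrated through applications to real engineering problems.
Specifically, for the 2D/1D method, the nTRACER steady-state calculation module, which covers the planar MOC, CMFD, and axial calculations, is the target of GPU acceleration. The MOC ray tracing scheme and the CMFD linear system solver are optimized for massive parallelization, and concurrent CPU-GPU computation is used in the MOC calculation to exploit the heterogeneous computing environment. In the CMFD calculation, the massively parallelizable DSPAI preconditioner replaces LU-type preconditioners, and an iterative refinement technique is introduced to implement mixed precision. For the axial calculation, an axial MOC solver with improved parallel efficiency, accuracy, and stability is developed.
For the continuous-energy Monte Carlo method, that is, in the development of PRAGMA, optimization of the cross section search and vectorization of the random walk are the keys to GPU acceleration. The unionized grid method is improved through a linear hashing scheme and a nuclide-wise merging of the grid structures across temperatures. A vectorized event-based particle tracking algorithm is introduced, and region partitioning and energy sorting schemes are used together with the event-based algorithm to dramatically raise the likelihood of memory coalescing in the cross section search for depleted fuel calculations.
Various schemes for applying GPU-based continuous-energy Monte Carlo calculation to real operating plants are also developed. For an effective treatment of resonance scattering, the RST target velocity sampling method, which resolves the shortcomings of DBRC and WCM, is developed, and a domain decomposition method is introduced to enable large-scale power reactor calculations within limited GPU memory capacity. In addition, an original scheme named MSC is used for practical Monte Carlo depletion calculations, and the localized delta-tracking scheme handles the temperature distribution and composition changes within the fuel pellets exactly and at no additional cost, enabling efficient thermal feedback and depletion calculations. CMFD and ramp-up source convergence acceleration schemes, which improve the practicality of massive particle calculations, are also introduced.
The results of this study show the high potential of direct whole-core calculation methods as practical nuclear design tools. A whole-core steady-state calculation with the 2D/1D method could be completed within a few minutes, and massive-particle whole-core Monte Carlo calculations with billions of particles have become a routine task that can be carried out in minutes. All of these results were achieved on a practical computing cluster equipped with dozens of consumer GPUs. GPU performance is still growing exponentially, and the practicality of the developed frameworks will therefore continue to improve with time.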

Language: English

URI: https://hdl.handle.net/10371/174863

https://dcollection.snu.ac.kr/common/orgView/000000164514

Files in This Item:

000000164514.pdf 8.83 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Energy Systems Engineering (에너지시스템공학부)
  - Nuclear Engineering (원자핵공학전공)
    - Theses (Ph.D. / Sc.D._원자핵공학과)
