Explainable reinforcement learning and rule reduction for advanced building control

Abstract:
Building control is becoming more complex because modern building control systems must handle not only conventional systems such as HVAC and lighting but also newer ones such as intermittent renewables and energy storage systems. Advanced building controllers must therefore balance trade-offs among multiple objectives and adapt automatically to a dynamic environment. Although it is widely acknowledged that reinforcement learning (RL) can improve building control, several challenges must be addressed before RL can be applied in real buildings: (1) unstable and poor control actions during the early training period of RL may cause unexpected costs; (2) most RL-based control strategies remain difficult for facility managers to understand and interpret in daily practice. Applying RL algorithms to building control makes artificial intelligence the decision-maker, so building owners and operators need to be able to interpret and understand the controller's intentions and decision-making process.
To address the first challenge, a federated model, a new type of simulation model, is proposed for pre-training RL agents. The federated model is an integrated data-driven model: it divides a building system into modules according to physical causality and develops each module as a data-driven model, so that the whole system can be simulated. A federated model of the target building's complex cooling system is built from six modules, each developed using data gathered from the building energy management system (BEMS). The federated model overcomes limitations of physics-based (first-principles) simulation models (e.g., topology rules, model calibration). A Deep Q-Network (DQN) is applied to learn the dynamics of the cooling system and to explore control strategies that reduce energy use while still supplying cooling to the building. Comparing the control performance of the DQN with that of the baseline control currently applied by the building operators shows that the RL controller can significantly improve the system's control efficiency and that the federated model can provide a sufficient virtual environment for training the controller.
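The idea of chaining data-driven modules in causal order can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two module names, the synthetic data, and the use of a one-variable least-squares fit as the data-driven model. The abstract does not say which model type each of the six modules uses.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class LinearModule:
    """One data-driven module: a 1-D least-squares fit y = a*x + b.

    A linear fit is only a stand-in; the real modules may be nonlinear.
    """
    name: str
    a: float = 0.0
    b: float = 0.0

    def fit(self, xs: Sequence[float], ys: Sequence[float]) -> "LinearModule":
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        sxx = sum((x - mx) ** 2 for x in xs)
        sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        self.a = sxy / sxx
        self.b = my - self.a * mx
        return self

    def predict(self, x: float) -> float:
        return self.a * x + self.b


@dataclass
class FederatedModel:
    """Chains modules in causal order: each output feeds the next module."""
    modules: List[LinearModule] = field(default_factory=list)

    def simulate(self, x: float) -> float:
        for module in self.modules:
            x = module.predict(x)
        return x


# Synthetic "logged" data (hypothetical): pump speed -> flow -> cooling.
speeds = [0.0, 1.0, 2.0, 3.0]
flows = [1.0, 3.0, 5.0, 7.0]
cooling = [1.5, 2.5, 3.5, 4.5]

flow_module = LinearModule("pump_to_flow").fit(speeds, flows)
cooling_module = LinearModule("flow_to_cooling").fit(flows, cooling)
model = FederatedModel([flow_module, cooling_module])

# The chained model can now act as a virtual environment for a controller.
print(model.simulate(3.0))
```

Each module is fitted only on the variables it is causally linked to, so a module can be retrained or replaced without touching the rest of the chain.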
To enhance the interpretability of the DQN agent, a decision tree is used to extract an explanation of the agent's decision-making process. State-action pairs generated by the agent are used to train the decision tree. Post-hoc interpretation using a shallow but easily interpretable model enhances the transparency and interpretability of reinforcement learning. In addition, the classification produced by the decision tree yields if-then rules, a reduced version of the control strategy learned by the agent. The performance of the reduced rule-based control is compared with that of the DQN controller. The reduced rules prove good enough: the difference in energy savings between the two controllers is marginal, at 2.8%.
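The extraction step (fit a shallow tree to the agent's state-action pairs, then read the leaves as if-then rules) can be sketched as follows. This is a minimal CART-style tree in pure Python, not the thesis implementation. The function `agent_policy` is a hypothetical stand-in for a trained DQN's greedy action, and the state features (`outdoor_temp`, `hour`) and action names are invented for illustration.

```python
from collections import Counter
from itertools import product


def agent_policy(state):
    """Hypothetical stand-in for a trained agent's greedy action."""
    outdoor_temp, hour = state
    if outdoor_temp > 28.0 and 9 <= hour < 18:
        return "chiller_high"
    if outdoor_temp > 24.0:
        return "chiller_low"
    return "off"


def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return the (feature, threshold) that lowers weighted Gini, or None."""
    best, best_score = None, gini(y)
    for f in range(len(X[0])):
        values = sorted({x[f] for x in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [a for x, a in zip(X, y) if x[f] <= t]
            right = [a for x, a in zip(X, y) if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best


def fit_tree(X, y, depth):
    """Internal nodes are (feature, threshold, left, right); leaves are actions."""
    split = best_split(X, y) if depth > 0 else None
    if split is None:
        return Counter(y).most_common(1)[0][0]  # leaf: majority action
    f, t = split
    left = [(x, a) for x, a in zip(X, y) if x[f] <= t]
    right = [(x, a) for x, a in zip(X, y) if x[f] > t]
    return (f, t,
            fit_tree([x for x, _ in left], [a for _, a in left], depth - 1),
            fit_tree([x for x, _ in right], [a for _, a in right], depth - 1))


def predict(tree, x):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree


def rules(tree, names, conds=()):
    """Flatten the tree into one if-then rule per leaf."""
    if not isinstance(tree, tuple):
        return ["IF " + (" AND ".join(conds) or "TRUE") + " THEN " + tree]
    f, t, left, right = tree
    return (rules(left, names, conds + (f"{names[f]} <= {t:g}",))
            + rules(right, names, conds + (f"{names[f]} > {t:g}",)))


# State-action pairs sampled from the (stand-in) agent over a grid of states.
names = ["outdoor_temp", "hour"]
X = [(float(t), float(h)) for t, h in product(range(20, 35), range(24))]
y = [agent_policy(s) for s in X]

tree = fit_tree(X, y, depth=4)
fidelity = sum(predict(tree, x) == a for x, a in zip(X, y)) / len(X)
for rule in rules(tree, names):
    print(rule)
print(f"fidelity = {fidelity:.3f}")
```

Fidelity here means the share of sampled states in which the tree picks the same action as the agent. Limiting `depth` trades fidelity for fewer, shorter rules, which is the trade-off that a reduced rule set has to justify through a performance comparison.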
This study reports the development of explainable RL for cooling control of an existing office building. A decision tree is applied to a trained DQN agent, and a set of reduced-order control rules is derived. The study proposes a rule reduction framework based on explainable reinforcement learning and demonstrates that simple, quantitatively evaluated rules can perform nearly as well as a complex reinforcement learning algorithm. Its significance lies in proposing a way to derive building control rules backed by quantitative evaluation.

Language: eng

URI: https://hdl.handle.net/10371/177736

https://dcollection.snu.ac.kr/common/orgView/000000167605

Files in This Item:

000000167605.pdf 2.34 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Architecture and Architectural Engineering (건축학과)
  - Theses (Master's Degree_건축학과)
