Deep Reinforcement Learning Based Scheduler for Flexible Job Shops with Sequence Dependent Setups

박인범

Deep Reinforcement Learning Based Scheduler for Flexible Job Shops with Sequence Dependent Setups : 순서 의존적 셋업이 있는 유연 잡 샵을 위한 심층 강화학습 기반 스케줄러

Authors: 박인범

Advisor: 박종헌

Issue Date: 2020

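Publisher: Seoul National University Graduate School

Description: Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, 2020. 2. 박종헌.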

Abstract: This thesis studies a flexible job shop scheduling problem (FJSP) with sequence dependent setups. FJSP becomes significantly complicated when there are several constraints related to re-entrant flows in manufacturing systems. At the same time, the scheduling problems need to be solved frequently to effectively manage the variabilities in production requirements, available machines, and initial setup status. Accordingly, scheduling decisions are usually required to be made on an hourly basis, making it challenging to obtain high quality schedules within the time limit for large-scale manufacturing systems.

To minimize the makespan of FJSP, three scheduling methods using deep reinforcement learning (DRL) are presented in this thesis. First, we suggest a decentralized scheduler (DS) in which each agent determines setup decisions in a decentralized manner and learns a policy by sharing a neural network among the agents to deal with the changes in the number of machines. Furthermore, novel definitions of state, action, and reward are proposed to address the variabilities in production requirements and initial setup status.

Second, we introduce a centralized scheduling approach in which an agent determines actions given observations of jobs and machines. To reduce the state-space complexity inherent in centralized learning, a novel definition of the state is developed by abstracting observations from the environment. Based on the centralized approach, we propose two schedulers that select a rule-based method (CS-R) and a job-machine pair (CS-P) as an action, respectively. Specifically, CS-R employs a $Q$-network to learn a centralized policy, and CS-P uses an actor-critic deep reinforcement learning method with the Wolpertinger policy to approximate the continuous features of job-machine pairs.
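The Wolpertinger-style selection step can be sketched as follows. This is a generic illustration of the technique, not the CS-P implementation: the feature vectors, the toy critic, and the names `wolpertinger_select`, `candidates`, and `proto_action` are hypothetical, and the critic's dependence on the state is omitted for brevity. The actor outputs a continuous "proto-action", the k nearest feasible job-machine pairs are retrieved, and the critic picks the best among them.

```python
import math

def wolpertinger_select(proto_action, candidates, critic, k=3):
    """Pick a discrete action near a continuous proto-action.

    candidates maps each feasible (job, machine) pair to its feature vector.
    critic scores a feature vector (stand-in for a learned Q-value).
    """
    # Step 1: retrieve the k pairs whose features are closest to the proto-action.
    nearest = sorted(
        candidates,
        key=lambda pair: math.dist(proto_action, candidates[pair]),
    )[:k]
    # Step 2: let the critic choose the highest-valued pair among them.
    return max(nearest, key=lambda pair: critic(candidates[pair]))

# Toy data: two-dimensional features per job-machine pair (illustrative only).
candidates = {
    ("J1", "M1"): (0.10, 0.90),
    ("J2", "M1"): (0.40, 0.50),
    ("J1", "M2"): (0.80, 0.20),
    ("J3", "M2"): (0.45, 0.55),
}
proto_action = (0.40, 0.60)          # stand-in for the actor network's output
toy_critic = lambda feats: -feats[0]  # stand-in critic: prefers a small first feature

chosen = wolpertinger_select(proto_action, candidates, toy_critic, k=2)
print(chosen)  # ('J2', 'M1')
```

With `k=1` the nearest pair `("J3", "M2")` would be returned directly; with `k=2` the critic overrides pure proximity and selects `("J2", "M1")`, which is the refinement the two-step scheme is meant to provide.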

To verify the robustness of the proposed method, neural networks (NNs) trained on small-scale scheduling problems are used to solve large-scale scheduling problems. Through extensive experiments on solving scheduling problems from real-world semiconductor packaging lines, we demonstrate that the proposed approach outperforms rule-based, meta-heuristic, and other RL methods in terms of the makespan while incurring shorter computation time than the meta-heuristics considered. Furthermore, the trained NN performs well in solving unseen real-world scale problems even under stochastic processing time, suggesting the viability of the proposed method for real-world manufacturing lines.
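Abstract (translated from Korean): This thesis studies a flexible job shop scheduling problem with sequence-dependent setups. The problem becomes considerably more complex when constraints such as the re-entrant flows found in manufacturing systems are present. At the same time, the scheduling problem must be solved frequently to respond effectively to variability in production requirements, available machines, and initial setup status. Because scheduling must be performed on an hourly basis to cope with this variability, it is difficult to obtain high-quality schedules within the time limit for large manufacturing systems.

To minimize the makespan of flexible job shops with sequence-dependent setups, three setup scheduling methods using deep reinforcement learning are presented. First, we propose a decentralized scheduler in which each agent makes setup decisions in a decentralized manner and learns a policy by sharing a neural network among the agents to cope with changes in the number of machines. In addition, new definitions of state, action, and reward are proposed to handle variability in production requirements and initial setup status.

Second, we introduce a centralized scheduling approach in which decisions are made based on observations of all jobs and machines. To reduce the complexity of the state space that arises in centralized learning, a new definition of the state is proposed by abstracting the observed information. On this basis, we propose a rule-based scheduler and a scheduler that selects a job-machine pair as an action. The former learns a centralized policy based on a deep Q-network, and the latter uses an actor-critic deep reinforcement learning method and the Wolpertinger policy to approximate the continuous features of job-machine pairs.

To verify the robustness of the proposed methods, neural networks trained on small-scale scheduling problems are used to solve large-scale scheduling problems. Through experiments on scheduling problems from real semiconductor packaging lines, we show that the proposed methods outperform rule-based, meta-heuristic, and other reinforcement learning methods while requiring shorter computation time than the meta-heuristics considered. Furthermore, the trained neural networks perform well on unseen real-world-scale problems even with stochastic processing times, which shows that the proposed methods can be applied to real semiconductor packaging lines.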

Language: eng

URI: https://hdl.handle.net/10371/167583

http://dcollection.snu.ac.kr/common/orgView/000000158795

Files in This Item:

000000158795.pdf 3.66 MB

Appears in Collections:

College of Engineering/Engineering Practice School (공과대학/대학원)
- Dept. of Industrial Engineering (산업공학과)
  - Theses (Ph.D. / Sc.D._산업공학과)
