Abstract
Traditional rule-based cluster tool scheduling in semiconductor manufacturing faces significant limitations, including inflexibility, reliance on domain-specific expertise, and suboptimal performance in dynamic and complex environments. These methods often struggle to adapt to varying process conditions and equipment configurations, which are common in modern fabrication facilities (fabs). Furthermore, previous research has typically relied on simplified simulators of cluster tools, failing to capture the full complexity of real-world semiconductor manufacturing equipment. To address these challenges, this study examines the potential of deep reinforcement learning (DRL) for optimizing cluster tool scheduling. It presents a comprehensive simulation environment that models a cluster tool system with both vacuum transfer module (VTM) and atmospheric transfer module (ATM) robots. DRL agents are evaluated progressively, starting with a single-agent deep Q-network (DQN) and advancing to a multi-agent DQN (MADQN) framework that schedules the combined VTM-ATM system. Experimental results demonstrate that the proposed DRL agents consistently outperform traditional rule-based methods in terms of productivity and adaptability. In the complex multi-agent environment, the MADQN agent delivers robust performance across all tested configurations, achieving a productivity improvement of up to 8.9% over standard rule-based scheduling. These findings highlight the potential of DRL to overcome the limitations of existing scheduling methods and significantly enhance productivity in semiconductor manufacturing.