Abstract
This study explores an enterprise fission path optimization strategy based on the Soft Actor-Critic (SAC) algorithm and analyzes its impact on the parent company and on overall operational efficiency. Firstly, enterprise finance and marketing data from the public dataset of the National Bureau of Statistics are pre-processed. Secondly, a multi-level reward function is designed that covers short-term financial and market indices as well as long-term indices that measure dynamic capabilities, namely innovation, market agility, and resource integration. Finally, the enterprise fission scenario is modeled as a complex decision environment and solved with SAC, a deep reinforcement learning (DRL) algorithm. The state space comprises the enterprise's current financial situation, market performance, and dynamic capability level, and the action space encompasses the strategic choices available in enterprise fission, so that the fission decision process can be simulated. The entropy regularization of SAC prompts the model to balance exploration and exploitation while optimizing the construction of dynamic capabilities. The experimental results show that the fission path optimized by DRL improves resource allocation efficiency by an average of 20.4% and market response speed by an average of 25.2%. More importantly, dynamic capability construction is significantly enhanced: the innovation capability index increases by 15.4%, market agility by 12.3%, and resource integration capability by 10.5%. This indicates that the strategy can help accelerate the formation of industrial clusters. Therefore, the SAC-based enterprise fission path optimization strategy constructed in this study can bring lasting competitive advantages to enterprises.
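
To make the multi-level reward concrete, the following is a minimal sketch and not the reward function used in the study. It assumes that every index is normalized to [0, 1], that the reward is the change in each index between two consecutive decision steps, and that the short-term and long-term levels are combined with equal weights. The names `FirmState` and `multi_level_reward`, the field names, and the weights `w_short` and `w_long` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FirmState:
    """Hypothetical state of the enterprise; every index is assumed to lie in [0, 1]."""
    financial: float    # short-term financial index
    market: float       # short-term market index
    innovation: float   # long-term dynamic capability: innovation
    agility: float      # long-term dynamic capability: market agility
    integration: float  # long-term dynamic capability: resource integration


def multi_level_reward(prev: FirmState, curr: FirmState,
                       w_short: float = 0.5, w_long: float = 0.5) -> float:
    """Weighted sum of the mean short-term and mean long-term index changes.

    The weights are illustrative assumptions, not values from the study.
    """
    # Short-term level: average improvement in the financial and market indices.
    short_term = ((curr.financial - prev.financial)
                  + (curr.market - prev.market)) / 2
    # Long-term level: average improvement in the three dynamic capability indices.
    long_term = ((curr.innovation - prev.innovation)
                 + (curr.agility - prev.agility)
                 + (curr.integration - prev.integration)) / 3
    return w_short * short_term + w_long * long_term


if __name__ == "__main__":
    before = FirmState(0.5, 0.5, 0.5, 0.5, 0.5)
    after = FirmState(0.6, 0.5, 0.5, 0.5, 0.8)
    print(multi_level_reward(before, after))
```

In a sketch like this, raising `w_long` relative to `w_short` would steer the agent toward fission actions that build dynamic capabilities even when the immediate financial return is small.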