Abstract
Action selection in cluttered environments, where individuals must simultaneously pursue goals and avoid obstacles, poses a significant challenge for the brain. To uncover the mechanisms underlying action selection in such contexts, we propose a computational model that extends stochastic optimal control theory with a novel framework for obstacle avoidance. The model decomposes action selection into a weighted combination of individual control policies, each generated for either target approach or obstacle avoidance. By integrating value information from goals, obstacles, and actions into a unified measure of "relative desirability", the model dynamically determines each policy's contribution to the overall action selection process. We evaluated the framework using simulated target-reaching tasks in cluttered environments modeled on previous human studies. The results showed that the model captures key features of human motor behavior, including the influence of obstacle properties on movement trajectories and the transient tendency to initiate movements toward obstacles before avoiding them. This work offers new insights into the dynamic interaction between approach and avoidance behaviors, providing a comprehensive framework for understanding action selection in complex, naturalistic settings.
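The core idea of the abstract, blending per-target and per-obstacle control policies with weights derived from a softmax over their values, can be illustrated with a minimal sketch. This is not the paper's implementation: the value functions (negative distance), the softmax temperature `beta`, and the unit-vector policies are all simplifying assumptions made here for illustration.

```python
import numpy as np

def blend_policies(position, targets, obstacles, beta=5.0):
    """Blend candidate control policies by softmax "relative desirability".

    Illustrative sketch only: each target contributes an attractive policy,
    each obstacle a repulsive one; the value assigned to each candidate
    (negative distance, a hypothetical choice) sets its softmax weight.
    """
    policies = []  # candidate velocity commands (unit vectors)
    values = []    # desirability of each candidate
    for t in targets:
        d = t - position
        policies.append(d / (np.linalg.norm(d) + 1e-9))  # step toward target
        values.append(-np.linalg.norm(d))                # nearer target -> more desirable
    for o in obstacles:
        d = position - o
        dist = np.linalg.norm(d)
        policies.append(d / (dist + 1e-9))               # step away from obstacle
        values.append(-dist)                             # nearer obstacle -> avoidance dominates
    w = np.exp(beta * np.array(values))
    w /= w.sum()                                         # relative desirability weights
    return sum(wi * pi for wi, pi in zip(w, policies))   # weighted combination of policies
```

With no obstacles nearby, the blended action points at the target; when an obstacle sits close to the current position, its avoidance policy receives most of the weight, which is one way the model could produce the transient obstacle-directed movements the abstract describes.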