Abstract
Artificial intelligence (AI) is catching up with human abilities in many respects. In this context, whether AI models such as large language models (LLMs) can replace human decision-making has become a topic of considerable debate. To test the performance of AI in everyday decision-making, we compared humans, LLMs, and reinforcement learning (RL) in a multi-day commute decision-making game. The game represents a collaborative decision-making process in which individual and collective outcomes are interdependent. We examined several performance metrics, including overall system results, the progress of system convergence, individual decision dynamics, and individual decision mechanisms. We found that LLMs exhibit human-like abilities to learn from historical experience and to achieve convergence when making daily commute decisions. However, in multi-person collaboration, LLMs still face challenges, such as weak perception of others' choices, poor group decision-making mechanisms, and a lack of physical knowledge.