Abstract
Molecular dynamics (MD) simulations are essential for understanding biomolecular systems but remain challenging to automate. Recent advances in large language models (LLMs) have enabled LLM-based agents to automate complex scientific tasks. In this paper, we introduce MDCrow, an agentic LLM assistant capable of automating MD workflows for proteins. MDCrow uses chain-of-thought reasoning over 40 expert-designed tools for handling and processing files, setting up simulations, analyzing simulation outputs, and retrieving relevant information from literature and databases. We assess MDCrow's performance across 25 common tasks of varying complexity and evaluate the agent's robustness to task difficulty and prompt style. We find that gpt-4o completes increasingly complex tasks with low variance, followed closely by llama3-405b, a compelling open-source model. While prompt style does not influence the best models' performance, it significantly affects that of smaller models.