Abstract
Sustainable food production and food security depend on increasing agricultural productivity within existing arable land. This requires translating complex agronomic research into actionable, field-specific crop management recommendations. Despite substantial advances in agricultural research, a persistent knowledge-practice gap continues to impede the widespread adoption of evidence-based management practices. We evaluate whether large language models (LLMs) can bridge this gap by generating crop management recommendations from scientific literature. Using US soybean production as a case study, we developed a semi-automated, human-in-the-loop pipeline (hereafter "our system") that adheres to systematic review protocols. Our system achieved high accuracy in literature screening, outperforming standalone models. However, when the task was to generate a general soybean management plan, expert evaluators rated the output of two commercial LLMs more favorably than the plan produced by our system. This work highlights the need for systems that address user trust and deliver tailored, field-specific advice that farming communities find both trustworthy and practically useful.