Abstract
Clinicians face an ever-increasing volume of medical literature and are turning to large language models and "deep research" systems to retrieve, organize, and synthesize biomedical evidence. In our use of these tools, we have found them effective at producing coherent, comprehensive summaries and at proposing testable hypotheses. However, their outputs are prone to flattening evidentiary hierarchies, overgeneralizing across heterogeneous populations and comparators, and occasionally propagating hallucinated citations. These failure modes risk automation bias and erosion of transparency if introduced into clinical pathways without guardrails. Here, we propose a "clinician-in-the-loop" framework in which clinicians remain the gatekeepers of artificial intelligence-assisted evidence synthesis. We outline three core duties for clinicians: (1) evidence weighting that privileges randomized trials, high-quality meta-analyses, and absolute risk communication; (2) contextual integration across pathophysiology, existing evidence, and patient populations; and (3) provenance and bias auditing through source verification, uncertainty reporting, and counter-summaries. We further explore how healthcare institutions, medical educators, policymakers, and publishers of medical literature can promote literacy and transparency regarding the use of "deep research" tools, including the implementation of reporting standards, provenance disclosures, and equity surveillance.