Promoting Responsible DeepSeek Deployment in Health Care: Scoping Review Comparing Grey and White Literature

Abstract

BACKGROUND: DeepSeek is an open-source large language model (LLM), and it has greatly accelerated LLM adoption in health care. Its rapid deployment has sparked concerns regarding its impact on patient outcomes and safety. However, little is known about how DeepSeek is used and regulated in health care.

OBJECTIVE: This study aimed to (1) systematically review the characteristics of DeepSeek deployed in the top 100 hospitals in China, and (2) compare the performance and risks of DeepSeek between hospital disclosures and research evidence.

METHODS: We searched the official websites and WeChat accounts of the top 100 hospitals in China and the databases of Web of Science and PubMed, using the terms "DeepSeek" and "large language models." Searches were limited to records after January 15, 2025, when DeepSeek was first released. All searches were conducted on May 20, 2025, with an update on June 28, 2025. We extracted the basic characteristics of DeepSeek; its aims, evaluation approach, performance, and risks; and hospital regulations. A coding framework was developed covering the application scenarios, evaluation dimensions, and risk sources of LLMs. The risk of bias was assessed using the Joanna Briggs Institute checklist.

RESULTS: We identified a total of 58 DeepSeek models in 48 out of the top 100 Chinese hospitals and found 27 studies in the literature. The first hospital deployment of DeepSeek was recorded on February 10, 2025, and deployment rapidly expanded to 37 hospitals within a month. Concurrently, most related research studies (20/27, 74%) were published after May 2025. Among deployments and studies that reported version information, DeepSeek-reasoner (R1) was the most frequently used model, and private deployment was the predominant approach. DeepSeek was mainly used to assist in clinical decision-making, including patient diagnosis and treatment recommendation. Among hospital disclosures, only 36% (21/58) clearly indicated a predeployment assessment, 22% (13/58) presented assessment results, and 9% (5/58) identified potential risks and countermeasures. We found poor transparency in hospital reporting, with none of the disclosures presenting evaluation details. Hospitals were more likely to report higher performance and fewer risks for DeepSeek.

CONCLUSIONS: This is one of the first scoping reviews to reveal the rapid, widespread deployment of DeepSeek in China's leading hospitals, primarily for clinical decision support. The deployment of DeepSeek in China's leading hospitals poses potential risks to patient outcomes and safety. We highlight the urgent need for existing regulations to be expanded to downstream developers and users to promote the responsible use of LLMs in health care. Hospitals need to use a more rigorous validation process and adopt a more transparent reporting policy. The main limitations of this review include the restriction to top-tier hospitals and the inherent constraints of gray literature. These factors should be considered when interpreting the findings.
