Safety-focused development and evaluation of an LLM sexual well-being chatbot for women: A methods-focused feasibility study


Abstract

BACKGROUND: Large language models (LLMs) offer opportunities for sexual health education, but integrating them into digital products presents challenges and risks, particularly around clinical safety and monitoring at scale. This study designed and evaluated a safety-focused framework for an LLM-based sexual well-being chatbot in a mobile application.

METHODS: We conducted a methods-focused feasibility study of the chatbot's multi-stage development and evaluation. We used an interdisciplinary, medically led three-phase development process, including a five-stage evaluation framework combining synthetic test cases, clinician-led vulnerability testing, and a controlled release to real users to assess clinical accuracy and safety.

RESULTS: The chatbot met predefined precision and recall thresholds in synthetic testing. Clinically inaccurate responses remained below 2% across clinician review stages, with no high-severity unsafe responses. In the controlled release, 5195 real user interactions were reviewed. Clinically inaccurate responses occurred in 0.90% (47/5195) of dialogues, and unsafe responses remained within predefined severity thresholds.

CONCLUSION: This study demonstrates the feasibility of a structured framework for developing and evaluating LLM-based sexual health chatbots with clinical safety oversight. This approach helps to address gaps in safety reporting and could be adapted for other sensitive clinical domains.
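As a minimal sketch of how the headline metrics above can be computed: the function and confusion counts below are illustrative assumptions (the study does not publish its confusion matrix), while the 47/5195 figure is taken directly from the abstract.

```python
# Hedged sketch: computing precision/recall over flagged unsafe responses
# and the clinically-inaccurate-response rate from the controlled release.
# All names and the tp/fp/fn counts are hypothetical; 47 and 5195 come
# from the abstract.

def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Standard precision and recall from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Illustrative synthetic-test confusion counts (not from the study).
p, r = precision_recall(tp=90, fp=10, fn=5)

# Inaccuracy rate in the controlled release (figures from the abstract).
inaccurate, reviewed = 47, 5195
rate_pct = 100 * inaccurate / reviewed

print(f"precision={p:.2f} recall={r:.2f} inaccuracy={rate_pct:.2f}%")
```

Running this reproduces the 0.90% dialogue-level inaccuracy rate reported in the results.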
