Evaluating generative AI responses to real-world drug-related questions


Abstract

Generative Artificial Intelligence (AI) systems such as OpenAI's ChatGPT, with an unprecedented ability to generate human-like text and converse in real time, hold potential for large-scale deployment in clinical settings such as substance use treatment. Treatment for substance use disorders (SUDs) is particularly high stakes, requiring evidence-based clinical treatment, mental health expertise, and peer support. Thus, promises of AI systems addressing deficient healthcare resources and structural bias are relevant within this domain, especially in an anonymous setting. This study explores the effectiveness of generative AI in answering real-world substance use and recovery questions. We collect questions from online recovery forums, generate responses with ChatGPT and Meta's LLaMA-2, and have SUD clinicians rate these AI responses. While clinicians rated the AI-generated responses as high quality, we discovered instances of dangerous disinformation, including disregard for suicidal ideation, incorrect emergency helplines, and endorsement of home detox. Moreover, the AI systems produced inconsistent advice depending on question phrasing. These findings indicate a risky mix: responses that appear high quality and accurate on initial inspection yet contain inaccurate and potentially deadly medical advice. Consequently, while generative AI shows promise, its real-world application in sensitive healthcare domains necessitates further safeguards and clinical validation.
