Comparing large language models and search engine responses to common orthodontic questions


Abstract

BACKGROUND: Large language models (LLMs) have shown potential to support patient education and self-management, but their performance on orthodontic questions has yet to be explored. OBJECTIVES: This study compares the quality, empathy, readability, and satisfaction of responses from LLMs and search engines to common orthodontic questions. METHODS: Forty-five common orthodontic questions (in six categories) and a prompt were developed, and a self-designed multidimensional evaluation questionnaire was constructed. The questions were presented to 5 LLMs and 3 search engines on December 22, 2024. The primary outcomes were the median expert-rated scores of LLM versus search engine responses for quality, empathy, readability, and satisfaction, rated on 5- or 10-point Likert scales. RESULTS: LLMs scored significantly higher than search engines in quality (4.00 vs. 3.50, p < 0.001), empathy (3.75 vs. 3.50, p < 0.001), readability (4.00 vs. 3.75, p < 0.001), and satisfaction (8.00 vs. 7.25, p < 0.001). LLM-generated responses were also rated significantly higher than search engine responses in the therapeutic outcomes, appliance selection, and cost categories. CONCLUSIONS: In this cross-sectional study, LLMs, particularly GPT-4o, outperformed search engines. These results indicate the potential of LLMs as supplementary tools for orthodontic patient education and self-management.
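The abstract does not name the statistical test behind the reported p-values; a common choice for comparing median Likert ratings between two independent groups is the Mann-Whitney U test. A minimal stdlib sketch with hypothetical ratings (not data from this study):

```python
from statistics import median

# Hypothetical 1-5 Likert quality ratings for one question's responses.
# These numbers are illustrative only, not the study's data.
llm_quality = [4, 4, 5, 4, 3, 4]
engine_quality = [3, 4, 3, 3, 4, 3]

def mann_whitney_u(a, b):
    """U statistic for sample `a` vs. `b`; tied values receive midranks."""
    combined = sorted((value, idx) for idx, value in enumerate(a + b))
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values starting at position i.
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        midrank = (i + 1 + j) / 2  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[combined[k][1]] = midrank
        i = j
    rank_sum_a = sum(ranks[: len(a)])
    return rank_sum_a - len(a) * (len(a) + 1) / 2

print("LLM median:", median(llm_quality))        # 4
print("Engine median:", median(engine_quality))  # 3
print("U:", mann_whitney_u(llm_quality, engine_quality))
```

For a real analysis, `scipy.stats.mannwhitneyu` also returns the p-value; the hand-rolled version above only shows how the rank-based statistic is formed.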
