
Systematic benchmarking demonstrates large language models have not reached the diagnostic accuracy of traditional rare-disease decision support tools

Reese, Justin T; Chimirri, Leonardo; Bridges, Yasemin; Danis, Daniel; Caufield, J Harry; Gargano, Michael A; Kroll, Carlo; Schmeder, Andrew; Liu, Fengchen; Wissink, Kyran; McMurry, Julie A; Graefe, Adam S L; Niyonkuru, Enock; Korn, Daniel R; Casiraghi, Elena; Valentini, Giorgio; Jacobsen, Julius O B; Haendel, Melissa; Smedley, Damian; Mungall, Christopher J; Robinson, Peter N
