CARDBiomedBench: a benchmark for evaluating the performance of large language models in biomedical research
Journal: Lancet Digital Health
Impact factor: 24.1
DOI: 10.1016/j.landig.2025.100943
Bianchi, Owen; Willey, Maya; Alvarado, Chelsea X; Danek, Benjamin; Khani, Marzieh; Kuznetsov, Nicole; Dadu, Anant; Shah, Syed; Koretsky, Mathew J; Makarious, Mary B; Weller, Cory; Levine, Kristin S; Kim, Sungwon; Jarreau, Paige; Vitale, Dan; Marsan, Elise; Iwaki, Hirotaka; Leonard, Hampton; Bandres-Ciga, Sara; Singleton, Andrew B; Nalls, Mike A; Mokhtari, Shekoofeh; Khashabi, Daniel; Faghri, Faraz