Scalable scientific interest profiling using large language models



Abstract

OBJECTIVE: Research profiles highlight scientists' research focus, enabling talent discovery and fostering collaboration, but they are often outdated. Automated, scalable methods are urgently needed to keep these profiles current.

METHODS: In this study, we design and evaluate two large language model (LLM)-based methods for generating scientific interest profiles, one summarizing researchers' PubMed abstracts and the other summarizing the Medical Subject Headings (MeSH) terms of their publications, and compare these machine-generated profiles with researchers' self-summarized interests. We collected the titles, MeSH terms, and abstracts of PubMed publications for 595 faculty members affiliated with Columbia University Irving Medical Center (CUIMC); for 167 of them, we also obtained human-written online research profiles. GPT-4o-mini, a state-of-the-art LLM, was then prompted to summarize each researcher's interests. Both manual and automated evaluations were conducted to characterize the similarities and differences between the machine-generated and self-written research profiles.

RESULTS: The similarity study showed low ROUGE-L, BLEU, and METEOR scores, reflecting little overlap between the terminology used in machine-generated and self-written profiles. BERTScore analysis revealed moderate semantic similarity between machine-generated and reference summaries (F1: 0.542 for MeSH-based, 0.555 for abstract-based) despite the low lexical overlap. In validation, paraphrased summaries achieved a higher F1 of 0.851. A further comparison between the original and paraphrased manually written summaries indicates the limitations of such metrics. The Kullback-Leibler (KL) divergence of term frequency-inverse document frequency (TF-IDF) values (8.56 and 8.58 for profiles derived from MeSH terms and abstracts, respectively) suggests that machine-generated summaries employ different keywords than human-written summaries. Manual reviews further showed that 77.78% of reviews rated the overall impression of MeSH-based profiles as "good" or "excellent," with readability receiving favorable ratings in 93.44% of cases, though granularity and factual accuracy varied. Overall, panel reviews favored MeSH-derived profiles over abstract-derived ones in 67.86% of cases.

CONCLUSION: LLMs promise to automate scientific interest profiling at scale. Profiles derived from MeSH terms have better readability than profiles derived from abstracts. Overall, machine-generated summaries differ from human-written ones in their choice of concepts, with human-written summaries introducing more novel ideas.
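
To make the keyword-level comparison concrete, the following is a minimal Python sketch of one way to compute a KL divergence between the TF-IDF weight distributions of two profiles. It is not the authors' implementation: the abstract does not specify tokenization, IDF weighting, smoothing, or the corpus used for document frequencies, so the function names (`tokenize`, `tfidf_distributions`, `kl_divergence`), the smoothed IDF formula, the `eps` smoothing constant, and the two example profiles are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it on non-alphanumeric characters."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_distributions(docs, eps=1e-9):
    """Return the shared vocabulary and one probability distribution per document.

    Each weight is a raw term count times a smoothed IDF,
    ln((1 + N) / (1 + df)) + 1. The constant eps is added to every weight
    so that no probability is exactly zero; this is an assumption made here
    to keep the KL divergence finite, not a detail taken from the paper.
    """
    counts_per_doc = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts_per_doc))
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = {t: sum(1 for c in counts_per_doc if t in c) for t in vocab}
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocab}
    dists = []
    for counts in counts_per_doc:
        weights = [counts[t] * idf[t] + eps for t in vocab]
        total = sum(weights)
        dists.append([w / total for w in weights])
    return vocab, dists


def kl_divergence(p, q):
    """KL(P || Q) in nats for two distributions over the same vocabulary."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Hypothetical example profiles, not data from the study.
machine = "Research focuses on cardiovascular disease, hypertension, and risk prediction models."
human = "My lab studies how blood pressure regulation affects heart failure outcomes."

_, (p, q) = tfidf_distributions([machine, human])
print(f"KL(machine || human) = {kl_divergence(p, q):.3f}")
```

Because KL divergence is asymmetric and highly sensitive to how zero-probability terms are smoothed, values from a sketch like this are not comparable to the 8.56 and 8.58 reported above; the sketch only illustrates why two summaries that describe the same research with different keywords produce a large divergence.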
