Progress and trends on machine learning in proteomics during 1997-2024: a bibliometric analysis

1997-2024年蛋白质组学机器学习的进展与趋势:文献计量分析

阅读:4

Abstract

OBJECTIVE: Despite growing interest in the application of machine learning (ML) in proteomics, a comprehensive and systematic mapping of this research domain has been lacking. This study addresses this gap by conducting the first large-scale bibliometric analysis focused exclusively on ML-driven proteomics, aiming to elucidate its knowledge structure, development trajectory, and emerging research trends. METHODS: A total of 5,156 publications from the Web of Science Core Collection (1997-2024) were retrieved and analyzed. Bibliometric tools including CiteSpace 6.4.R1, VOSviewer 1.6.18, Scimago Graphica, and the R package bibliometrix were used to extract and visualize key bibliometric indicators. After data cleaning and de-duplication, analyses were conducted on keyword co-occurrence, citation networks, leading journals, influential authors, and institutional collaboration patterns to construct a comprehensive landscape of ML applications in proteomics. RESULTS: The number of publications has grown exponentially since 2010, with an average annual growth rate of 12.53% and a notable surge of 65.14% occurring between 2019 and 2020. The United States emerged as the most productive country, while the Chinese Academy of Sciences led among institutions. AlphaFold2-related research received the highest citations, reflecting the transformative role of deep learning in protein structure prediction. Thematic clustering revealed key research foci, including deep learning algorithms, protein-protein interaction prediction, and integrative multi-omics analysis. The field is characterized by strong interdisciplinary convergence, involving computer science, molecular biology, and clinical research. High-impact journals and influential authors were also identified, providing benchmarks for academic influence and collaboration. CONCLUSION: This study offers the first comprehensive bibliometric analysis of ML in proteomics, revealing key themes such as deep learning, pretrained models, and multi-omics integration. Future efforts should focus on building interpretable models, enhancing cross-disciplinary collaboration, and ensuring secure, standardized data use to advance precision medicine. SYSTEMATIC REVIEW REGISTRATION: https://doi.org/10.17605/OSF.IO/F4WUG.

特别声明

1、本页面内容包含部分的内容是基于公开信息的合理引用;引用内容仅为补充信息,不代表本站立场。

2、若认为本页面引用内容涉及侵权,请及时与本站联系,我们将第一时间处理。

3、其他媒体/个人如需使用本页面原创内容,需注明“来源:[生知库]”并获得授权;使用引用内容的,需自行联系原作者获得许可。

4、投稿及合作请联系:info@biocloudy.com。