An ensemble approach for research article classification: a case study in artificial intelligence.

Authors: Lu Min, Tang Lie, Zhou Xianke
Text classification of research articles in emerging fields poses significant challenges because such fields have complex boundaries, are interdisciplinary, and evolve rapidly. Traditional methods, which rely on manually curated search terms and keyword matching, often have low recall because keyword lists are inherently incomplete. To address this limitation, this study introduces a deep learning-based ensemble approach for article classification in dynamic research areas, using the field of artificial intelligence (AI) as a case study. Our approach applies a decision tree, SciBERT, and regular expression matching to different fields of each article, and uses a support vector machine (SVM) to merge the results from the different models. We evaluated the method on a manually labeled dataset and found that the combined approach captured around 97% of AI-related articles in the Web of Science (WoS) corpus with a precision of 0.92. This represents a 0.15 increase in F1-score over the existing search-term-based approach. We then performed an ablation study, which showed that each component of the ensemble contributes to overall performance and that SciBERT outperforms other pre-trained BERT models in this case.
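
As a quick check on how the reported figures relate, the sketch below computes the F1-score (the harmonic mean of precision and recall) from the two numbers given in the abstract. It assumes that "around 97% of AI-related articles captured" corresponds to a recall of 0.97; the function and variable names are illustrative and do not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Figures taken from the abstract. Treating "around 97%" as a recall
# of exactly 0.97 is an assumption made for this illustration.
reported_precision = 0.92
reported_recall = 0.97

combined_f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {combined_f1:.3f}")  # prints "F1 = 0.944"
```

Under that assumption, the combined approach corresponds to an F1-score of roughly 0.94.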
