Turtling: a time-aware neural topic model on NIH grant data

Abstract

MOTIVATION: Recent initiatives for federal grant transparency allow direct knowledge extraction from large volumes of grant texts, serving as a powerful alternative to traditional surveys. However, computational modeling of grant text is challenging because grants are usually multifaceted and their topics evolve constantly.

RESULTS: We propose Turtling, a time-aware neural topic model with three distinguishing characteristics. First, Turtling employs pretrained biomedical word embeddings to extract research topics. Second, it leverages a probabilistic time-series model to allow smooth and coherent topic evolution. Lastly, Turtling adds a topic diversity loss and a funding institute classification loss to improve topic quality and facilitate funding institute prediction. We apply Turtling to publicly available NIH grant text and show that it significantly outperforms other methods on topic quality metrics. We also demonstrate that Turtling can provide insights into research topic evolution by detecting topic trends across decades. In summary, Turtling may be a valuable tool for grant text analysis.

AVAILABILITY AND IMPLEMENTATION: Turtling is freely available as open-source software at https://github.com/aicb-ZhangLabs/Turtling.
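The abstract does not say how the topic diversity loss is computed. As a rough illustration only, one common way to encourage distinct topics is to penalize the mean pairwise cosine similarity between topic vectors. The sketch below assumes that form; the function names `cosine` and `topic_diversity_penalty` are hypothetical and are not taken from the Turtling code base.

```python
import math
from typing import Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def topic_diversity_penalty(topics: Sequence[Sequence[float]]) -> float:
    """Mean pairwise cosine similarity over all distinct topic pairs.

    Assumed form of a diversity penalty (not confirmed by the source):
    lower values mean the topic vectors point in more different directions.
    """
    k = len(topics)
    if k < 2:
        raise ValueError("need at least two topics to compare")
    total = 0.0
    pairs = 0
    # Visit each unordered pair (i, j) with i < j exactly once.
    for i in range(k):
        for j in range(i + 1, k):
            total += cosine(topics[i], topics[j])
            pairs += 1
    return total / pairs


# Toy topic vectors: mutually orthogonal topics versus identical topics.
orthogonal = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
identical = [[1.0, 1.0, 1.0, 1.0]] * 3
print(topic_diversity_penalty(orthogonal))
print(topic_diversity_penalty(identical))
```

In a training setup, a term like this would typically be added to the main objective with a weighting coefficient, so that minimizing the total loss pushes topic vectors apart. Whether Turtling uses this or a different formulation would need to be checked against its repository.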
