Thread Structure Learning on Online Health Forums with Partially Labeled Data

Abstract

Thread structures, the reply relationships between posts in online forums, help readers understand thread content and improve the effectiveness of automated tasks such as forum information retrieval and expert finding. However, most online forums have only partially labeled structures: some reply relationships are known while others are unknown. Previous studies have learned and predicted thread structures, but existing work does not leverage the partially available structure to learn the complete thread structure. We also observe that many online health forums are person-centric: persons are mentioned across posts, which provides hints about the reply relationships between posts. In this paper, we first propose to learn complete thread structures by leveraging the partially known structures with a statistical machine learning model, thread conditional random fields (threadCRF). We then propose to combine person resolution, the process of identifying the same person mentioned in different contexts, with threadCRF for thread structure learning. We empirically verify the effectiveness of the proposed approaches.
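
To make the problem setting concrete, the following is a minimal Python sketch of a partially labeled thread and a naive baseline that fills in the unknown reply links. It is not threadCRF and not the authors' method: the data layout, the `ROOT` marker, the person identifiers, and the `predict_parents` function are all hypothetical, and person resolution is assumed to have already mapped each mention to a shared identifier. The heuristic simply attaches a post with an unknown parent to the most recent earlier post that mentions the same person, falling back to the immediately preceding post.

```python
ROOT = -1  # hypothetical marker: the first post replies to nothing


def predict_parents(posts):
    """Fill unknown reply links with a naive heuristic (NOT threadCRF).

    Each post is a dict with:
      "persons": set of resolved person identifiers mentioned in the post
      "parent":  index of the post it replies to, ROOT, or None if unknown
    """
    predicted = []
    for i, post in enumerate(posts):
        if post["parent"] is not None:
            # Keep the partially available label as-is.
            predicted.append(post["parent"])
            continue
        # Default guess: reply to the immediately preceding post.
        best = i - 1 if i > 0 else ROOT
        # Prefer the most recent earlier post that shares a person mention.
        for j in range(i - 1, -1, -1):
            if post["persons"] & posts[j]["persons"]:
                best = j
                break
        predicted.append(best)
    return predicted


# Toy thread: two known links, two unknown links (parent=None).
posts = [
    {"persons": {"p_mother"}, "parent": ROOT},
    {"persons": {"p_doctor"}, "parent": 0},
    {"persons": {"p_mother"}, "parent": None},
    {"persons": set(), "parent": None},
]

print(predict_parents(posts))
```

In this toy thread, the third post shares the identifier `p_mother` with the first post and is attached to it, while the fourth post mentions no one and falls back to its predecessor. A learned model such as the threadCRF described above would instead score candidate links jointly, using the known links as supervision.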
