BACKGROUND: This paper presents a novel approach to the problem of hedge detection, which involves identifying so-called hedge cues for labeling sentences as certain or uncertain. This is the classification problem for Task 1 of the CoNLL-2010 Shared Task, which focuses on hedging in the biomedical domain. We propose to view hedge detection as a simple disambiguation problem, restricted to words that have previously been observed as hedge cues. As the feature space for the classifier is still very large, we also perform experiments with dimensionality reduction using the method of random indexing. RESULTS: The SVM-based classifiers developed in this paper achieve the best published results so far for sentence-level uncertainty prediction on the CoNLL-2010 Shared Task test data. We also show that random indexing can be successfully applied to reduce the dimensionality of the original feature space by several orders of magnitude without sacrificing classifier performance. CONCLUSIONS: This paper introduces a simplified approach to detecting speculation or uncertainty in text, focusing on the biomedical domain. Evaluated at the sentence level, our SVM-based classifiers achieve the best published results so far. We also show that the feature space can be aggressively compressed using random indexing while still maintaining comparable classifier performance.
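As a rough illustration of the random indexing idea mentioned in the abstract, the sketch below assigns each original feature a sparse random vector of +1/-1 entries in a much smaller space and represents an instance as the weighted sum of the vectors of its active features. This is a minimal sketch of the general technique, not the paper's implementation: the class name `RandomIndexer`, the dimensionality, the number of non-zero entries, and the feature names are all assumptions made for the example, since the abstract does not give these details.

```python
import random


def make_index_vector(dim, nonzeros, rng):
    """Build a sparse ternary index vector as {position: +1 or -1}."""
    positions = rng.sample(range(dim), nonzeros)
    return {pos: rng.choice((-1, 1)) for pos in positions}


class RandomIndexer:
    """Hypothetical helper: maps sparse high-dimensional features to `dim` dimensions."""

    def __init__(self, dim=100, nonzeros=4, seed=0):
        self.dim = dim
        self.nonzeros = nonzeros
        self.rng = random.Random(seed)
        self.index_vectors = {}  # feature name -> sparse index vector

    def index_vector(self, feature):
        # Each feature gets one fixed random index vector, created on first use.
        if feature not in self.index_vectors:
            self.index_vectors[feature] = make_index_vector(
                self.dim, self.nonzeros, self.rng
            )
        return self.index_vectors[feature]

    def project(self, features):
        """Sum the index vectors of the active features, weighted by feature value."""
        reduced = [0.0] * self.dim
        for feature, value in features.items():
            for pos, sign in self.index_vector(feature).items():
                reduced[pos] += sign * value
        return reduced


# Illustrative feature names only; the real feature templates are not in the abstract.
indexer = RandomIndexer(dim=50, nonzeros=4, seed=42)
instance = {"cue=may": 1.0, "left1=protein": 1.0, "right1=bind": 1.0}
reduced = indexer.project(instance)
```

The reduced vector would then be passed to the classifier in place of the original sparse feature vector. The target dimensionality and the sparsity of the index vectors are tuning choices, and the values above were picked only to keep the example small.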
Predicting speculation: a simple disambiguation approach to hedge detection in biomedical literature.
Author: Velldal, Erik
| Journal: | Journal of Biomedical Semantics | Impact factor: | 2.000 |
| Year: | 2011 | Volume/pages: | 2011 Oct 6; 2 Suppl 5(Suppl 5):S7 |
| doi: | 10.1186/2041-1480-2-S5-S7 | ||
