Opinion: Strategy of Semi-Automatically Annotating Full Text Corpus of Genomics & Informatics

Abstract

There is a community need for an annotated corpus consisting of the full texts of biomedical journal articles. In response, a prototype full-text corpus of Genomics & Informatics, called GNI version 1.0, was recently published, with 499 annotated full-text articles available as a corpus resource. However, GNI needs to be updated, because the texts were only shallow-parsed and annotated with several existing parsers. I list the issues associated with upgrading the annotations and give my opinion on the methodology for developing the next version of the GNI corpus, based on a semi-automatic strategy for more linguistically rich corpus annotation.
