Abstract
BACKGROUND: Risk of bias (RoB) assessment of randomized clinical trials (RCTs) is vital to answering systematic review questions accurately. Manually assessing RoB for hundreds of RCTs is a cognitively demanding and lengthy process. Automation has the potential to assist reviewers in rapidly identifying text in RCTs that indicates potential risks of bias. However, no corpus with annotated RoB text spans exists that could be used to fine-tune or evaluate large language models (LLMs), and there are no established guidelines for annotating RoB spans in RCTs. OBJECTIVE: The revised Cochrane risk of bias tool for randomized trials (RoB 2) provides comprehensive guidelines for RoB assessment; however, because of the inherent subjectivity of the tool, it cannot be used directly as RoB annotation guidelines. This study aimed to develop precise RoB text span annotation instructions that address this subjectivity and thus aid corpus annotation. METHODS: We leveraged the RoB 2 guidelines to develop visual instructional placards that serve as annotation guidelines for RoB spans and risk judgments. Expert annotators used these placards to annotate RoBuster, a dataset of 41 full-text RCTs from the domains of physiotherapy and rehabilitation. We report interannotator agreement (IAA) between 2 annotators for text span annotations before and after applying the visual instructions on a subset (n=9) of RoBuster, and we report IAA on bias risk judgments using Cohen κ. Moreover, we used a portion of RoBuster (n=10) to evaluate an LLM (GPT-3.5) on the challenging task of RoB span extraction within a straightforward evaluation framework, thereby demonstrating the utility of the corpus. RESULTS: We present a corpus of 41 RCTs with fine-grained text span annotations comprising more than 28,427 tokens across 22 RoB classes. 
The IAA at the text span level, calculated using the F1 measure, varies from 0% to 90%, while Cohen κ for risk judgments ranges between -0.235 and 1.0. Using the visual instructions for annotation increases IAA by more than 17 percentage points. The LLM (GPT-3.5) shows promising but varied agreement with the expert annotations across the different bias questions. CONCLUSIONS: Despite comprehensive bias assessment guidelines and visual instructional placards, RoB annotation remains a complex task. Visual placards for bias assessment and annotation improve IAA over annotation without them; however, text annotation remains challenging for subjective questions and for questions whose supporting information is unavailable in the RCTs. Similarly, while GPT-3.5 demonstrates effectiveness, its accuracy diminishes on more subjective RoB questions and when information availability is low.
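To make the two agreement measures concrete, the following minimal Python sketch computes Cohen κ for categorical risk judgments and a token-level F1 for span overlap between two annotators. The function names and toy inputs are illustrative assumptions, not artifacts of the study; spans are represented simply as sets of token indices.

```python
from collections import Counter


def cohen_kappa(a, b):
    """Cohen's kappa for two annotators' categorical judgments.

    a, b: equal-length lists of labels (e.g., "low", "high", "some concerns").
    """
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    # Observed agreement: fraction of items with identical labels.
    po = sum(x == y for x, y in zip(a, b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    ca, cb = Counter(a), Counter(b)
    pe = sum(ca[k] * cb[k] for k in set(a) | set(b)) / (n * n)
    if pe == 1.0:  # both annotators used a single identical label throughout
        return 1.0
    return (po - pe) / (1 - pe)


def span_f1(gold_tokens, pred_tokens):
    """Token-level F1 between two annotators' span annotations.

    gold_tokens, pred_tokens: sets of token indices covered by each
    annotator's spans for a given RoB class.
    """
    tp = len(gold_tokens & pred_tokens)
    if tp == 0:
        return 0.0
    precision = tp / len(pred_tokens)
    recall = tp / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)
```

For example, `cohen_kappa(["low", "low", "high", "high"], ["low", "low", "high", "low"])` yields 0.5 (observed agreement 0.75 against chance agreement 0.5), and `span_f1({1, 2, 3}, {2, 3, 4})` yields 2/3.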