Artificial Intelligence-Assisted Data Extraction With a Large Language Model: A Study Within Reviews

Abstract

BACKGROUND: Data extraction is a critical but error-prone and labor-intensive task in evidence synthesis. Unlike other artificial intelligence (AI) technologies, large language models (LLMs) do not require labeled training data for data extraction.
OBJECTIVE: To compare an AI-assisted versus a traditional, human-only data extraction process.
DESIGN: Study within reviews (SWAR) using a prospective, parallel-group comparison with blinded data adjudicators.
SETTING: Workflow validation within 6 ongoing systematic reviews of interventions under real-world conditions.
INTERVENTION: Initial data extraction using an LLM (Claude, versions 2.1, 3.0 Opus, and 3.5 Sonnet) verified by a human reviewer.
MEASUREMENTS: Concordance, time on task, accuracy, sensitivity, positive predictive value, and error analysis.
RESULTS: The 6 systematic reviews in the SWAR yielded 9341 data elements from 63 studies. Concordance between the 2 methods was 77.2% (95% CI, 76.3% to 78.0%). Compared with the reference standard, the AI-assisted approach had an accuracy of 91.0% (CI, 90.4% to 91.6%) and the human-only approach an accuracy of 89.0% (CI, 88.3% to 89.6%). Sensitivities were 89.4% (CI, 88.6% to 90.1%) and 86.5% (CI, 85.7% to 87.3%), respectively, with positive predictive values of 99.2% (CI, 99.0% to 99.4%) and 98.9% (CI, 98.6% to 99.1%). Incorrect data were extracted in 9.0% (CI, 8.4% to 9.6%) of AI-assisted cases and 11.0% (CI, 10.4% to 11.7%) of human-only cases, with corresponding proportions of major errors of 2.5% (CI, 2.2% to 2.8%) versus 2.7% (CI, 2.4% to 3.1%). Missed data items were the most frequent error type in both approaches. The AI-assisted method reduced data extraction time by a median of 41 minutes per study.
LIMITATIONS: Assessing concordance and classifying errors required subjective judgment. Consistently tracking time on task was challenging.
CONCLUSION: Data extraction assisted by AI may offer a viable, more efficient alternative to human-only methods.
PRIMARY FUNDING SOURCE: Agency for Healthcare Research and Quality and RTI International.
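
For readers who want to see how measurements of this kind are typically computed, the following is a minimal sketch and not the authors' analysis code. It assumes the standard definitions of sensitivity and positive predictive value and uses a Wilson score interval for the 95% CI; the abstract does not state which interval method was used. The function names are illustrative, and the count of 7211 concordant elements is back-calculated from the reported 77.2% of 9341, not a figure taken from the paper.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 for about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of reference-standard items that were captured (missed items count as false negatives)."""
    return true_pos / (true_pos + false_neg)


def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """Share of extracted items that match the reference standard."""
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # Hypothetical count: 7211 is roughly 77.2% of the 9341 data elements.
    low, high = wilson_interval(7211, 9341)
    print(f"concordance 95% CI: {low:.1%} to {high:.1%}")
```

With these assumed counts the interval comes out at approximately 76.3% to 78.0%, in line with the reported concordance interval, though this agreement does not establish which method the authors applied.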
