Manual Evaluation of Record Linkage Algorithm Performance in Four Real-World Datasets



Abstract

OBJECTIVES: Patient data are fragmented across multiple repositories, yielding suboptimal and costly care. Record linkage algorithms are widely accepted solutions for improving the completeness of patient records. However, studies often fail to fully describe their linkage techniques. Further, while many frameworks evaluate record linkage methods, few focus on producing gold standard datasets. This highlights a need to assess these frameworks and their real-world performance. We use real-world datasets and expand upon previous frameworks to evaluate a consistent approach to the manual review of gold standard datasets and measure its impact on algorithm performance.

METHODS: We applied the framework, which includes elements for data description, reviewer training and adjudication, and software and reviewer descriptions, to four datasets. Record pairs were formed, and between 15,000 and 16,500 records were randomly sampled from these pairs. After training, two reviewers determined the match status of each record pair. If the reviewers disagreed, a third reviewer made the final adjudication.

RESULTS: Across the four datasets, the discordance rate ranged from 1.8 to 13.6%. While individual reviewers' discordance rates typically ranged between 1 and 5%, one reviewer exhibited a 59% discordance rate, showing the importance of the third reviewer. The original analysis was compared with three sensitivity analyses. The original analysis most often exhibited the highest predictive values compared with the sensitivity analyses.

CONCLUSION: Reviewers vary in their assessment of a gold standard, which can lead to variation in estimates of matching performance. Our analysis demonstrates how a multireviewer process can be applied to create gold standards, identify reviewer discrepancies, and evaluate algorithm performance.
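The review process described in the abstract (two reviewers per record pair, a third reviewer when they disagree, then comparison of an algorithm against the resulting gold standard) can be illustrated with a minimal Python sketch. This is not the authors' software: the names `PairReview`, `gold_label`, `discordance_rate`, and `predictive_values` are hypothetical, the sample data are invented for illustration, and the discordance rate is assumed here to be the share of record pairs on which the two primary reviewers disagree.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class PairReview:
    """Manual review of one record pair (hypothetical structure)."""
    pair_id: str
    reviewer_a: bool                    # True = reviewer A judged the pair a match
    reviewer_b: bool                    # True = reviewer B judged the pair a match
    adjudicator: Optional[bool] = None  # third reviewer, only needed on disagreement


def gold_label(review: PairReview) -> bool:
    """Return the gold standard label, using the adjudicator to break ties."""
    if review.reviewer_a == review.reviewer_b:
        return review.reviewer_a
    if review.adjudicator is None:
        raise ValueError(f"pair {review.pair_id} needs adjudication")
    return review.adjudicator


def discordance_rate(reviews: List[PairReview]) -> float:
    """Share of pairs on which the two primary reviewers disagree (assumed definition)."""
    if not reviews:
        raise ValueError("no reviews supplied")
    disagreements = sum(r.reviewer_a != r.reviewer_b for r in reviews)
    return disagreements / len(reviews)


def predictive_values(
    gold: Dict[str, bool], predicted: Dict[str, bool]
) -> Tuple[Optional[float], Optional[float]]:
    """Positive and negative predictive value of an algorithm against the gold standard."""
    tp = sum(predicted[p] and gold[p] for p in gold)
    fp = sum(predicted[p] and not gold[p] for p in gold)
    tn = sum(not predicted[p] and not gold[p] for p in gold)
    fn = sum(not predicted[p] and gold[p] for p in gold)
    ppv = tp / (tp + fp) if (tp + fp) else None
    npv = tn / (tn + fn) if (tn + fn) else None
    return ppv, npv


# Invented example data: four record pairs, one of which required adjudication.
reviews = [
    PairReview("p1", True, True),
    PairReview("p2", False, False),
    PairReview("p3", True, False, adjudicator=True),
    PairReview("p4", False, False),
]
gold = {r.pair_id: gold_label(r) for r in reviews}

# Hypothetical algorithm output for the same pairs.
predicted = {"p1": True, "p2": True, "p3": False, "p4": False}

rate = discordance_rate(reviews)
ppv, npv = predictive_values(gold, predicted)
```

Keeping the adjudicator's decision separate from the two primary labels means the discordance rate can still be computed after adjudication, which is what allows reviewer discrepancies to be identified alongside algorithm performance.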
