Quasi-continuous and discrete confidence rating scales for observer performance studies: Effects on ROC analysis

Abstract

RATIONALE AND OBJECTIVES: To examine, by a simulation study, how the number of categories in the rating scale used in an observer experiment affects the results of receiver operating characteristic (ROC) analysis.

MATERIALS AND METHODS: We have previously evaluated the effects of computer-aided diagnosis on radiologists' characterization of malignant and benign breast masses in serial mammograms. The likelihood of malignancy was rated on a quasi-continuous (0-100 points) confidence rating scale. In this study, we simulated the use of discrete confidence rating scales with fewer categories and analyzed the results with ROC methodology. The observers' estimates of the likelihood of malignancy were also mapped to BI-RADS assessments with five and seven categories, and ROC analysis was performed on the mapped data. The area under the ROC curve and the partial area index obtained from ROC analysis of the different confidence rating scales were compared.

RESULTS: The fitted ROC curves and the performance indices did not change significantly when the confidence rating scale was varied from 6 to 101 points, provided that the estimated operating points obtained directly from the data were distributed relatively evenly over the entire range of true-positive fraction (TPF) and false-positive fraction (FPF). Mapping the likelihood-of-malignancy observer data to the seven-category BI-RADS assessment scale allowed reliable ROC analysis. Mapping to the five-category BI-RADS scale, by contrast, could cause erratic ROC curve fitting because of the lack of operating points in the mid-range, or failure of ROC curve fitting because of data degeneration for some observers.

CONCLUSION: ROC analysis of discrete confidence rating scales with few but relatively evenly distributed data points over the entire FPF and TPF range is comparable to that of a quasi-continuous rating scale. However, ROC analysis of discrete confidence rating scales with few and unevenly distributed data points may yield unreliable estimates.
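The rescaling idea described above can be illustrated with a minimal Python sketch. It is not the study's procedure: the ratings are synthetic (drawn from hypothetical Gaussian distributions), the equal-width binning in `discretize` is an assumed mapping, and the area is the empirical trapezoidal area rather than the area under a fitted ROC curve. All function names are illustrative.

```python
import random

def discretize(rating, n_categories):
    """Map an integer 0-100 rating to a 0-based category index.

    Equal-width binning is an assumption made for this sketch; the
    mapping used in the study is not specified in the abstract.
    """
    return rating * n_categories // 101

def operating_points(neg, pos):
    """Empirical (FPF, TPF) pairs, thresholding at each distinct rating."""
    points = [(0.0, 0.0)]
    for t in sorted(set(neg) | set(pos), reverse=True):
        fpf = sum(r >= t for r in neg) / len(neg)
        tpf = sum(r >= t for r in pos) / len(pos)
        points.append((fpf, tpf))
    return points  # the lowest threshold always yields (1.0, 1.0)

def trapezoidal_auc(points):
    """Area under the empirical ROC polyline (trapezoidal rule)."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def simulated_ratings(mean, sd, n, rng):
    """Hypothetical 0-100 ratings: rounded Gaussian draws, clipped to range."""
    return [min(100, max(0, round(rng.gauss(mean, sd)))) for _ in range(n)]

rng = random.Random(0)
benign = simulated_ratings(35, 20, 200, rng)     # synthetic, not study data
malignant = simulated_ratings(65, 20, 200, rng)  # synthetic, not study data

auc_by_scale = {}
for n_categories in (101, 11, 6):
    neg = [discretize(r, n_categories) for r in benign]
    pos = [discretize(r, n_categories) for r in malignant]
    auc_by_scale[n_categories] = trapezoidal_auc(operating_points(neg, pos))
    print(n_categories, "categories: empirical AUC =",
          round(auc_by_scale[n_categories], 3))
```

Printing the output of `operating_points` for each scale shows how many distinct operating points survive the rescaling and how evenly they cover the FPF and TPF ranges, which is the property the abstract identifies as decisive.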
