Performance comparison of four published lung cancer prediction models applied to a cohort from the National Lung Screening Trial

Abstract

BACKGROUND: Mathematical prediction models (MPMs) based on clinical and radiologist-assessed features have been developed to assist with lung cancer risk assessment for imaging-detected lung nodules. However, MPMs were developed using different datasets, thresholds, and feature sets, making it difficult to cross-compare the published performance metrics and to determine prospective performance stability. The aim of this study is to use a large lung cancer screening cohort with identified pulmonary nodules to compare the performance of four MPMs, at a standardized sensitivity value, in reducing the false positive rate for lung cancer screening exams.

METHODS: This retrospective study used lung nodules identified on low-dose computed tomography (LDCT) in the National Lung Screening Trial (NLST) to evaluate four MPMs [Mayo Clinic (MC), Veterans Affairs (VA), Peking University (PU), and Brock University (BU)]. For cross-comparison, a small NLST sub-cohort (n=270) was used to determine a calibrated decision threshold for each model, targeting a sensitivity of 95% for detecting lung cancer. Performance was evaluated using area under the receiver-operating-characteristic curve (AUC-ROC), area under the precision-recall curve (AUC-PR), sensitivity, and specificity. The calibrated threshold was then applied to the remaining NLST cohort (n=1,083) to demonstrate the stability of the performance metrics.

RESULTS: A total of 1,353 patients [mean ± standard deviation (SD) age, 62.3±5.2 years; 746 male] were included, of whom 122 (9.0%) had a malignant nodule. At the target sensitivity of 95%, the highest testing specificity (correctly identified benign nodules) was seen in the BU and MC models (55% and 52%, respectively), compared to the VA (45%) and the PU (16%) models. The AUC-ROCs for BU (83%), MC (83%), PU (76%), and VA (77%) suggest moderate-to-high performance, while AUC-PR more accurately reflects that all the models have sub-optimal precision (27-33%).

CONCLUSIONS: Tuning the calibration thresholds of existing MPMs aids performance comparison and stability when the models are applied in the lung cancer screening setting. However, when targeting high sensitivity (95%), the achievable specificity of the MPMs is low (16-55%), which may limit clinical utility.
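The threshold calibration described in the methods can be illustrated with a short sketch. The abstract does not state how the authors selected each threshold, so the rule below (take the highest observed score that still flags at least the target fraction of malignant nodules in the calibration set) is an assumption, and the function names `calibrate_threshold` and `sensitivity_specificity` are hypothetical. The toy scores are made up for illustration and are not NLST data.

```python
import math
from typing import Sequence, Tuple


def calibrate_threshold(scores: Sequence[float], labels: Sequence[int],
                        target_sensitivity: float = 0.95) -> float:
    """Pick a decision threshold on a calibration set (assumed rule, see text).

    A nodule is called positive when its score is >= the threshold.
    labels: 1 = malignant, 0 = benign.
    """
    malignant = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not malignant:
        raise ValueError("calibration set contains no malignant cases")
    # Minimum number of malignant cases that must be flagged to reach the target.
    # The small epsilon guards against floating-point round-off before ceil().
    needed = max(1, math.ceil(target_sensitivity * len(malignant) - 1e-9))
    # The needed-th highest malignant score flags at least `needed` malignant cases.
    return malignant[needed - 1]


def sensitivity_specificity(scores: Sequence[float], labels: Sequence[int],
                            threshold: float) -> Tuple[float, float]:
    """Sensitivity and specificity of the rule `score >= threshold`."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one malignant and one benign case")
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Made-up calibration scores: 10 malignant nodules followed by 4 benign ones.
    cal_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.35, 0.3, 0.25, 0.2,
                  0.1, 0.15, 0.22, 0.5]
    cal_labels = [1] * 10 + [0] * 4
    thr = calibrate_threshold(cal_scores, cal_labels, target_sensitivity=0.9)
    # In the study design, the frozen threshold would next be applied to a
    # separate held-out cohort; here it is only checked on the calibration data.
    print(thr, sensitivity_specificity(cal_scores, cal_labels, thr))
```

Freezing the threshold on one sub-cohort and reporting sensitivity and specificity on the remaining cases is what lets the held-out numbers speak to stability: if the threshold were tuned on the same cases it is evaluated on, the target sensitivity would be met by construction.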