Abstract
BACKGROUND: This study explored the feasibility of a combined model integrating radiomics and clinical-semantic features to differentiate pulmonary tuberculosis (PTB) from non-tuberculous solid lung lesions on contrast-enhanced CT.
METHODS: In this study, 900 patients enrolled before October 2016 were randomly partitioned into training and internal validation sets at a 3:1 ratio, while those recruited between October 2016 and October 2017 formed an independent temporal validation set. Clinical-semantic features were selected through univariate analysis followed by multivariate analysis, while predictive radiomics features were identified using analysis of variance, Spearman correlation analysis, and least absolute shrinkage and selection operator (LASSO) regression. Binary logistic regression was then used to construct the clinical-semantic, radiomics, and combined models. Model performance was evaluated using average precision (AP) derived from the precision-recall curve, and differences between models were assessed using bootstrap resampling. Clinical utility was assessed using decision curve analysis.
RESULTS: Following feature selection, two clinical-semantic and three radiomics features were incorporated into the combined model. This model achieved APs of 0.91, 0.85, and 0.62 in the training, internal validation, and temporal validation sets, respectively, outperforming the clinical-semantic model, which yielded APs of 0.64, 0.61, and 0.41 (p < 0.001, p < 0.001, and p = 0.006, respectively). The radiomics model also outperformed the clinical-semantic model across the three sets, with APs of 0.88, 0.82, and 0.45. Decision curve analysis showed that the combined model provided a favorable net benefit across a range of threshold probabilities.
CONCLUSION: By integrating clinical-semantic and radiomics features, the combined model enables accurate differentiation between PTB and non-tuberculous solid lung lesions, potentially facilitating non-invasive diagnosis and personalized treatment planning.
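The evaluation steps named in METHODS (AP from the precision-recall curve, a bootstrap comparison between two models, and the net benefit used in decision curve analysis) can be illustrated with a minimal Python sketch. The abstract does not state which bootstrap scheme or software was used, so the paired percentile bootstrap below, the function names, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import random

def average_precision(y_true, scores):
    """Step-wise AP: sum over thresholds of (recall gain) * precision."""
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive case")
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = prev_tp = 0
    ap = 0.0
    i = 0
    while i < len(order):
        # Treat all cases tied at this score as a single threshold.
        j = i
        while j < len(order) and scores[order[j]] == scores[order[i]]:
            if y_true[order[j]] == 1:
                tp += 1
            else:
                fp += 1
            j += 1
        ap += (tp - prev_tp) / n_pos * (tp / (tp + fp))
        prev_tp = tp
        i = j
    return ap

def bootstrap_ap_difference(y_true, scores_a, scores_b, n_boot=2000, seed=0):
    """Paired percentile bootstrap of AP(a) - AP(b) (assumed scheme)."""
    rng = random.Random(seed)
    n = len(y_true)
    observed = average_precision(y_true, scores_a) - average_precision(y_true, scores_b)
    diffs = []
    while len(diffs) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]  # resample patients with replacement
        y = [y_true[k] for k in idx]
        if sum(y) == 0:
            continue  # AP is undefined without positives; draw again
        a = [scores_a[k] for k in idx]
        b = [scores_b[k] for k in idx]
        diffs.append(average_precision(y, a) - average_precision(y, b))
    diffs.sort()
    ci = (diffs[int(0.025 * n_boot)], diffs[int(0.975 * n_boot) - 1])
    # Two-sided p-value from the share of resampled differences on each side of zero.
    p_le = sum(d <= 0 for d in diffs) / n_boot
    p_ge = sum(d >= 0 for d in diffs) / n_boot
    return observed, ci, min(1.0, 2 * min(p_le, p_ge))

def net_benefit(y_true, probs, threshold):
    """Decision-curve net benefit: TP/n - FP/n * pt / (1 - pt)."""
    n = len(y_true)
    tp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 1)
    fp = sum(1 for y, p in zip(y_true, probs) if p >= threshold and y == 0)
    return tp / n - fp / n * threshold / (1 - threshold)

if __name__ == "__main__":
    # Toy labels (1 = PTB) and hypothetical predicted probabilities from two models.
    y = [1, 1, 1, 0, 0, 0, 0, 0]
    combined = [0.9, 0.8, 0.7, 0.3, 0.2, 0.1, 0.05, 0.01]
    clinical = [0.6, 0.2, 0.7, 0.5, 0.4, 0.3, 0.65, 0.1]
    print(bootstrap_ap_difference(y, combined, clinical, n_boot=500))
    print([net_benefit(y, combined, t) for t in (0.1, 0.3, 0.5)])
```

Resampling patients rather than scores keeps each patient's two model outputs paired, which is why both models are evaluated on the same resampled indices. Plotting `net_benefit` over a grid of thresholds, alongside the treat-all and treat-none strategies, gives a decision curve.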