[Application of Explainable Deep Learning in Differentiating Benign from Malignant Pulmonary Space-occupying Lesions and Classifying Pathological Subtypes of Lung Cancer]

【可解释深度学习在区分良恶性肺部占位性病变和肺癌病理亚型分类中的应用】


Abstract

BACKGROUND: Distinguishing benign from malignant pulmonary space-occupying lesions and classifying the pathological subtypes of lung cancer are critical for clinical decision-making. However, conventional methods often make insufficient use of multi-source clinical data, and deep learning models are often poorly interpretable. This study investigates the performance of interpretable deep learning algorithms in both tasks, using a hybrid architecture that combines Tab-Transformer, a Transformer model designed for tabular data, with the Residual Multi-Layer Perceptron (ResMLP). The hybrid is referred to as TT-ResMLP.

METHODS: Radiological characteristics, medical history, and laboratory findings were collected from 345 patients with pathologically confirmed pulmonary space-occupying lesions. The dataset was randomly split into a development set and a test set at an 8:2 ratio. Stable features were selected using the Spearman correlation test and the Least Absolute Shrinkage and Selection Operator (LASSO). The Synthetic Minority Over-sampling Technique (SMOTE) was employed to balance the samples, and 10-fold cross-validation was used to enhance model generalizability. Models were constructed using the Tab-Transformer algorithm, the ResMLP algorithm, and the TT-ResMLP hybrid. Model performance was evaluated using receiver operating characteristic (ROC) curves, the area under the curve (AUC), accuracy, specificity, sensitivity, and micro-averaged ROC (micro-ROC). SHapley Additive exPlanations (SHAP) analysis was performed on the best-performing model.

RESULTS: In the benign-versus-malignant diagnosis task, all three models performed well. The Tab-Transformer model performed best on the test set, followed by TT-ResMLP and ResMLP. SHAP analysis of the Tab-Transformer model ranked the most important features as age, pleural indentation, thrombin time, mean density, and ground-glass opacity. Pleural indentation contributed substantially to a malignant diagnosis, and its contribution increased with increasing age and decreasing thrombin time. In the lung cancer subtype classification task, all three models performed excellently, with the TT-ResMLP hybrid showing the best overall performance. SHAP analysis further revealed that the Lung Imaging Reporting and Data System (Lung-RADS) category was highly important across all three pathological subtypes. Male gender was positively associated with the prediction of squamous cell carcinoma, and neuron-specific enolase (NSE) played a significant role in predicting small cell carcinoma. For adenocarcinoma, the diagnostic probability was positively correlated with the Lung-RADS category, a relationship that was more pronounced at lower prothrombin time (PT) values. In contrast, a negative correlation was observed in the squamous cell carcinoma and small cell carcinoma subgroups, although gender and NSE levels could strengthen the contribution to risk prediction in these subgroups. Analysis of feature decision boundaries indicated that the Lung-RADS grade had high discriminative power for identifying adenocarcinoma, whereas NSE had stronger discriminative ability for identifying small cell carcinoma.

CONCLUSIONS: The TT-ResMLP hybrid architecture is effective for diagnosing whether pulmonary space-occupying lesions are benign or malignant and for classifying the pathological subtypes of lung cancer. The model has good interpretability, which helps identify key predictive features and clarify how they interact, thereby providing an effective tool for a deeper understanding of lung cancer biology and for clinical decision support.
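The evaluation metrics named in METHODS (accuracy, sensitivity, specificity, AUC, and micro-averaged ROC) can be illustrated with a minimal standard-library sketch. The abstract does not describe the study's implementation, so everything below is an assumption for illustration only: the function names `binary_metrics`, `auc`, and `micro_auc` are hypothetical, label 1 is assumed to mean malignant in the binary task, and the three-class task is assumed to use integer labels 0 to 2 with one predicted probability per class. Micro-averaging is shown as pooling every (sample, class) pair into a single binary problem before computing the AUC.

```python
from typing import Dict, List, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity for 0/1 labels (assumed: 1 = malignant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),  # recall on the positive class
        "specificity": tn / (tn + fp),  # recall on the negative class
    }


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def micro_auc(labels: Sequence[int], probs: Sequence[Sequence[float]]) -> float:
    """Micro-averaged AUC for a multi-class task.

    Labels are one-hot encoded and every (sample, class) pair is pooled
    into one binary problem, which is then scored with `auc`.
    """
    flat_true: List[int] = []
    flat_score: List[float] = []
    for label, row in zip(labels, probs):
        for k, p in enumerate(row):
            flat_true.append(1 if k == label else 0)
            flat_score.append(p)
    return auc(flat_true, flat_score)


if __name__ == "__main__":
    # Toy values only; these are not data from the study.
    print(binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1]))
    print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
    print(micro_auc([0, 1, 2], [[0.7, 0.2, 0.1], [0.5, 0.3, 0.2], [0.2, 0.2, 0.6]]))
```

The pairwise AUC is quadratic in the number of samples, which is acceptable for a cohort of a few hundred patients; in practice a library implementation would normally be used instead.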
