Abstract
BACKGROUND: Radiomic features are sensitive to changes in image acquisition and reconstruction, so assessing how advanced reconstruction methods affect the reliability of radiomic features is crucial. This study aimed to investigate the robustness of positron emission tomography-computed tomography (PET/CT) radiomic features under different reconstruction settings in patients with lung cancer.
METHODS: Sixty-two patients with lung lesions who underwent whole-body 18F-fluorodeoxyglucose (18F-FDG) PET/CT were retrospectively included; lesion volumes of interest (VOIs) were delineated on CT images in 58 patients, and 106 radiomic features were extracted. PET images were reconstructed with three algorithms: traditional ordered subset expectation maximization (OSEM); Hyper Iterative, a Bayesian penalized-likelihood iterative reconstruction algorithm; and Hyper deep progressive reconstruction (DPR), a deep learning-based progressive image reconstruction method. In addition to varying the reconstruction algorithm, the β parameter (for Hyper Iterative), and the acquisition duration, we analyzed nonoverlapping 30-s segments (0-30, 31-60, 61-90, and 91-120 s) reconstructed using OSEM to assess the impact of count statistics. The coefficient of variation (COV) and the concordance correlation coefficient (CCC) combined with the percentage difference (%Diff) were calculated for each feature across the different settings to evaluate stability and reproducibility.
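The three metrics named in the methods can be illustrated with a minimal sketch. The abstract does not give the formulas, so the definitions below are assumptions: COV as sample standard deviation over mean, CCC as Lin's concordance correlation coefficient with population moments, and %Diff as the mean relative difference from a reference setting. The function names are hypothetical.

```python
from statistics import mean, stdev


def cov_percent(values):
    """COV (%) of one feature in one lesion across settings.

    Assumption: sample standard deviation (n - 1) divided by |mean|.
    """
    return 100.0 * stdev(values) / abs(mean(values))


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient between two settings.

    x and y hold one feature's values over the same lesions.
    Assumption: population (divide-by-n) variances and covariance.
    """
    n = len(x)
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


def percent_diff(test, ref):
    """Mean percentage difference of `test` relative to `ref`.

    Assumption: %Diff = 100 * (test - ref) / ref, averaged over lesions.
    """
    return 100.0 * mean((t - r) / r for t, r in zip(test, ref))
```

Unlike Pearson correlation, the CCC penalizes a systematic offset between two settings, which is presumably why it is paired with %Diff when judging reproducibility.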
RESULTS: COV values were significantly and positively correlated across the five analysis groups, namely ReconGroup (different algorithms), HI-Para (different β values in Hyper Iterative), OSEM-Time (OSEM with varying acquisition times), HI-Time (Hyper Iterative with varying acquisition times), and DPR-Time (DPR with varying acquisition times), indicating that the variability of each feature is an inherent, stable property (all P values <0.001). A total of 27 robust features were identified with consistently low variability (COV ≤5%) and high reproducibility (CCC >0.8 and -10 < %Diff < 10) across changes in algorithms, parameter settings, and acquisition times. These included intensity-based features such as mean standardized uptake value (SUV) and total lesion glycolysis (TLG), as well as several second-order metrics. Analysis of the four nonoverlapping 30-s temporal segments revealed 10 features with COV ≤5% and 72 features with CCC >0.8 and -10 < %Diff < 10 across all six pairwise comparisons. The average COV did not differ significantly among the three algorithms, whereas CCC values differed significantly among the algorithms at each acquisition duration (all P values <0.001), and the algorithm with the highest reproducibility varied with the short-duration condition.
CONCLUSIONS: These findings suggest that certain radiomic features are highly robust across reconstruction algorithms, parameter settings, acquisition durations, and count statistics. Joint consideration of reconstruction, segmentation, and feature selection is essential to ensuring reproducibility in radiomics workflows. The robust features hold promise for tumor quantification in multicenter studies, longitudinal monitoring, and prospective trials, where imaging protocols may vary, and are strong candidates for standardized imaging biomarkers in routine clinical and research settings.
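The robustness criteria reported in the results (COV ≤5%, CCC >0.8, and -10 < %Diff < 10, met in every comparison) can be sketched as a simple filter. This is an illustrative sketch only: the function names, the input layout, and the pairing of one COV value with each comparison are assumptions, not the authors' implementation. Note that the segment analysis reports the COV and CCC/%Diff criteria separately, whereas this sketch applies them jointly, as for the 27 robust features.

```python
from itertools import combinations

# The four nonoverlapping 30-s segments named in the abstract.
SEGMENTS = ["0-30", "31-60", "61-90", "91-120"]


def segment_pairs(segments):
    """All unordered pairs of segments (six pairs for four segments)."""
    return list(combinations(segments, 2))


def is_robust(cov, ccc, pct_diff, cov_max=5.0, ccc_min=0.8, diff_max=10.0):
    """Apply the thresholds from the abstract to one comparison."""
    return cov <= cov_max and ccc > ccc_min and -diff_max < pct_diff < diff_max


def robust_features(metrics):
    """Keep features that pass the thresholds in every comparison.

    metrics: {feature_name: [(cov, ccc, pct_diff), ...]}, one tuple per
    comparison (hypothetical layout chosen for this sketch).
    """
    return [
        name
        for name, rows in metrics.items()
        if all(is_robust(*row) for row in rows)
    ]
```

For example, a feature with metrics `[(3.0, 0.95, 2.0), (4.5, 0.90, -6.0)]` would be kept, while one with a single comparison at CCC = 0.70 would be dropped.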