Abstract
BACKGROUND: Quality control (QC) is essential for ensuring the diagnostic reliability of single-photon emission computed tomography (SPECT) systems. However, reliance on third-party software for analyzing QC metrics introduces a source of variability that is not yet standardized: differences in QC results arising from the use of different image analysis software may compromise both equipment evaluation and interinstitutional comparability.

PURPOSE: This technical note assessed the variability in QC test results produced by different SPECT image analysis software packages, to underscore the need for improved standardization.

METHODS: Five representative commercial SPECT QC software packages (A–E) were used to analyze identical DICOM image sets acquired from four SPECT/CT systems in accordance with the WS 523-2019 standard. Evaluated metrics included file reading success rates, key performance indicators (intrinsic uniformity, spatial resolution, and linearity), and compliance rates. Statistical analysis employed ANOVA or Welch's test, followed by LSD post hoc testing, with effect sizes (η²) reported.

RESULTS: File reading success rates varied significantly (61.8%–100%), with software packages B and D exhibiting higher failure rates than the others. Compliance rates for identical devices also varied considerably (68.8%–100%). Statistically significant intersoftware differences were found for intrinsic integral uniformity (F = 10.17, p < 0.05, η² = 0.092), intrinsic spatial resolution (Welch F = 79.7, p < 0.05, η² = 0.477), and intrinsic differential linearity (F = 2.65, p < 0.05, η² = 0.137). The effect size for spatial resolution indicated a large effect (η² > 0.14), and that for differential linearity was effectively at the conventional large-effect threshold. Significant differences (p < 0.05) in key indicators were also observed across both the useful and central fields of view (UFOV/CFOV) and in both the X and Y directions. Pairwise comparisons indicated that the primary differences lay between software packages B, D, and E and the remaining packages.

CONCLUSION: We found significant disparities between software packages in both file reading capability and the analysis of key QC performance indicators. These differences directly affect the accuracy of equipment performance evaluations and interinstitutional comparability, potentially leading to divergent conclusions regarding the same device's compliance status.
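
For reference, the uniformity metric and effect sizes named above follow standard definitions. The sketch below is illustrative only and is not part of the study's software: it assumes the NEMA-style integral uniformity, IU = 100 × (max − min)/(max + min) computed after 9-point smoothing of the flood-field image, and η² = SS_between/SS_total from a one-way ANOVA; all input values are hypothetical.

    import numpy as np
    from scipy import ndimage, stats

    def integral_uniformity(flood: np.ndarray) -> float:
        """NEMA-style integral uniformity (%) of a flood-field image:
        IU = 100 * (max - min) / (max + min), computed after the standard
        9-point smoothing with a 1-2-1 / 2-4-2 / 1-2-1 kernel."""
        kernel = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=float) / 16.0
        smoothed = ndimage.convolve(flood.astype(float), kernel, mode="nearest")
        return 100.0 * (smoothed.max() - smoothed.min()) / (smoothed.max() + smoothed.min())

    def eta_squared(*groups) -> float:
        """Effect size eta^2 = SS_between / SS_total for a one-way ANOVA."""
        all_vals = np.concatenate(groups)
        grand_mean = all_vals.mean()
        ss_between = sum(len(g) * (np.mean(g) - grand_mean) ** 2 for g in groups)
        ss_total = ((all_vals - grand_mean) ** 2).sum()
        return ss_between / ss_total

    # Hypothetical example: integral uniformity values (%) for one device
    # as reported by three different analysis packages over repeated QC runs.
    sw_a = np.array([2.1, 2.3, 2.0, 2.2])
    sw_b = np.array([2.8, 3.0, 2.7, 2.9])
    sw_c = np.array([2.2, 2.4, 2.1, 2.3])

    f_stat, p_val = stats.f_oneway(sw_a, sw_b, sw_c)
    print(f"F = {f_stat:.2f}, p = {p_val:.4f}, "
          f"eta^2 = {eta_squared(sw_a, sw_b, sw_c):.3f}")

Under these assumptions, a significant F statistic with a large η² would indicate that the choice of analysis package, rather than measurement noise alone, drives the differences in reported uniformity.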