Abstract
Computerized features derived from medical imaging have shown great potential for building machine learning models that predict and prognosticate disease outcomes. However, the performance of such models depends on how robust the extracted features are to the institutional and acquisition variability inherent in clinical imaging. To address this challenge, we propose Variability Regularized Feature Selection (VaRFS), a framework that integrates feature variability as a regularization term to identify features that are both discriminable between outcome groups and generalizable across imaging differences. VaRFS employs a novel sparse regularization strategy within the Least Absolute Shrinkage and Selection Operator (LASSO) framework, for which we analytically confirm convergence guarantees and present an accelerated proximal variant for computational efficiency. We evaluated VaRFS across five clinical applications, including disease detection, treatment response characterization, and risk stratification, using over 700 multi-institutional imaging datasets. Compared to three conventional feature selection methods, VaRFS yielded consistently higher classifier AUCs in hold-out validation while balancing reproducibility, sparsity, and discriminability in medical imaging feature selection.
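
To make the idea of a variability-aware sparse penalty concrete, the sketch below shows one plausible way such a penalty could be combined with an accelerated proximal (FISTA-style) solver. It is an illustration only and not the VaRFS formulation: the objective (1/2n)||y - Xw||^2 + sum_j lam * (1 + gamma * v_j) * |w_j|, the per-feature variability scores v_j, and the names `variability_weighted_lasso`, `lam`, and `gamma` are all assumptions introduced here.

```python
import numpy as np


def soft_threshold(z, t):
    """Proximal operator of the (weighted) L1 norm: shrink each entry toward zero by t."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)


def variability_weighted_lasso(X, y, variability, lam=0.1, gamma=1.0, n_iter=500):
    """Hypothetical variability-weighted LASSO solved with FISTA.

    Assumed objective (not taken from the paper):
        (1 / 2n) * ||y - X w||^2 + sum_j lam * (1 + gamma * v_j) * |w_j|
    where v_j >= 0 is a variability score for feature j, so that less
    stable features receive a larger sparsity penalty.
    """
    n, p = X.shape
    penalty = lam * (1.0 + gamma * np.asarray(variability, dtype=float))
    # Step size 1/L, where L is the Lipschitz constant of the smooth term's gradient.
    step = n / (np.linalg.norm(X, 2) ** 2)
    w = np.zeros(p)
    z = w.copy()  # extrapolated point used by the accelerated scheme
    t = 1.0
    for _ in range(n_iter):
        grad = X.T @ (X @ z - y) / n
        w_next = soft_threshold(z - step * grad, step * penalty)
        t_next = (1.0 + np.sqrt(1.0 + 4.0 * t * t)) / 2.0
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)  # momentum step
        w, t = w_next, t_next
    return w
```

Under this assumed penalty, a feature with a high variability score must explain considerably more of the outcome to remain in the model, which is one way a selector could trade discriminability against reproducibility. Features whose returned coefficients are non-zero would be the selected set.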