Abstract
OBJECTIVE: To develop a transfer-learning Bayesian sparse logistic regression model that uses an informed prior to transfer information learned from one dataset to another, facilitating model fitting in small-sample clinical patient-level prediction problems where little information is available. METHODS: We propose a Bayesian logistic regression framework for prediction that transfers regression coefficient information from a model fitted on a larger dataset (on the order of 10^5 to 10^6 patients by 10^5 features) to a small-sample model (on the order of 10^3 patients). Our approach imposes an informed, hierarchical prior on each regression coefficient, defined as a discrete mixture of the Bayesian Bridge shrinkage prior and an informed normal distribution. We compare the performance of the informed model against traditional methods, measured primarily by area under the curve, calibration, bias, and sparsity, using both simulations and a real-world problem. RESULTS: Across all experiments, transfer learning outperformed the traditional L1-regularized model in discrimination, calibration, bias, and sparsity. Even a continuous shrinkage prior alone, without the informed component, improved model performance relative to L1 regularization. CONCLUSION: Transfer learning with informed priors can help fine-tune prediction models in small datasets that suffer from a lack of information. A major benefit is that the prior does not depend on patient-level information, so transfer learning can be conducted without violating privacy. In future work, the model can be applied to learning between disparate databases or to similar lack-of-information settings such as rare-outcome prediction.
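To make the prior described in METHODS concrete, the following is a minimal sketch of a two-component mixture log-density for a single regression coefficient. It is an illustration under stated assumptions, not the authors' implementation: the function names, the fixed mixture weight `weight`, and the default values of `tau` and `alpha` are all hypothetical, and the hierarchical layer (hyperpriors on the scale and mixture parameters) is not reproduced. The bridge component uses the standard exponential-power form, with density proportional to exp(-|beta/tau|^alpha).

```python
import math


def bridge_logpdf(beta, tau=1.0, alpha=0.5):
    """Log density of the bridge (exponential-power) shrinkage prior.

    p(beta) = alpha / (2 * tau * Gamma(1/alpha)) * exp(-|beta / tau|**alpha)
    tau (global scale) and alpha (exponent) defaults are illustrative only.
    """
    log_norm = math.log(alpha) - math.log(2.0 * tau) - math.lgamma(1.0 / alpha)
    return log_norm - abs(beta / tau) ** alpha


def normal_logpdf(beta, mean, sd):
    """Log density of the informed normal component N(mean, sd**2)."""
    z = (beta - mean) / sd
    return -0.5 * z * z - math.log(sd) - 0.5 * math.log(2.0 * math.pi)


def mixture_logpdf(beta, prior_mean, prior_sd, weight=0.5, tau=1.0, alpha=0.5):
    """Log density of the discrete mixture:
    weight * Normal(prior_mean, prior_sd**2) + (1 - weight) * Bridge(tau, alpha).

    prior_mean and prior_sd stand in for coefficient information taken from
    the large-dataset model; weight is an assumed fixed value in (0, 1).
    """
    a = math.log(weight) + normal_logpdf(beta, prior_mean, prior_sd)
    b = math.log1p(-weight) + bridge_logpdf(beta, tau, alpha)
    m = max(a, b)  # log-sum-exp for numerical stability
    return m + math.log(math.exp(a - m) + math.exp(b - m))
```

In this sketch, a coefficient whose large-dataset estimate is, say, 1.5 with a small standard deviation receives far more prior mass near 1.5 than the shrinkage component alone would give it, while coefficients without informative estimates are still pulled toward zero by the bridge component. Because only summary values (`prior_mean`, `prior_sd`) enter the prior, no patient-level records from the source dataset are needed.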