Abstract
BACKGROUND: Federated Learning (FL) offers a privacy-preserving solution for multi-party data collaboration in smart healthcare. However, data heterogeneity among hospitals and among patients often leaves some hospitals with suboptimal performance under a single global FL model. Current clustering-based FL methods struggle to adapt to complex and diverse data distributions, which degrades model performance. METHODS: We propose a novel framework, Federated Gaussian Mixture Clustering (FedGMC), which leverages Gaussian Mixture Clustering to train personalized FL models. FedGMC determines the optimal number of clusters before the FL process begins, reducing the time and computational cost that existing approaches incur by traversing multiple clustering configurations. RESULTS: The FedGMC framework was evaluated on real-world eICU datasets with various classifiers and performance metrics. Experimental results show that FedGMC outperforms the baseline methods in overall performance across the combinations of two classifiers and two performance metrics. Moreover, it mitigates the risk of performance degradation for participating hospitals following FL. CONCLUSIONS: The FedGMC framework effectively addresses clinical heterogeneity, enhancing predictive performance and ensuring fairness among participating medical institutions. These improvements increase the willingness of data owners to engage in collaborative FL initiatives.