Abstract
BACKGROUND: Multi-center Federated Learning (FL) has played a significant role in disease prediction, offering a feasible solution to the challenges of cross-institutional collaboration. However, fairness issues inherent in traditional FL frameworks have limited their further adoption in the medical field.

METHODS: We propose a Contribution Fairness Federated Learning model based on Multi-center Core Data Extraction (FedCMC). The model assesses the actual contribution of each center from both the data and the model perspective, using two fairness indicators: data information richness and model quality. For the data contribution assessment, we design a Multi-center Core Data Extraction Module (MCDEM), which extracts a representative core dataset from the original training pool. This filters out redundant information, making the data contribution assessment fairer and improving the model's generalization ability. The center models are then aggregated with weights based on each center's contribution, which increases the benefit to high-contribution centers and gives more users an incentive to participate in federated learning. Finally, a personalized federated learning strategy is adopted in which the model is fine-tuned on each center's core dataset, improving the relevance and accuracy of its predictions.

RESULTS: We analyze data from 902 endometrial cancer (EC) patients across four independent medical institutions. In centers A, B, and C, the FedCMC model achieves areas under the ROC curve (AUC) of 0.8261, 0.8750, and 0.8964, respectively. Comparative analysis with three traditional federated learning algorithms indicates that FedCMC offers significant advantages in both performance and fairness.

CONCLUSION: FedCMC effectively alleviates fairness issues in traditional FL frameworks and accurately predicts the myometrial invasion (MI) status of EC patients, supporting personalized treatment strategies.
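
The contribution-weighted aggregation step described in METHODS can be illustrated with a minimal Python sketch. It is not the FedCMC implementation: the abstract does not state how the two fairness indicators are combined, so the convex combination controlled by `alpha`, the function names `contribution_weights` and `aggregate`, and all numeric scores below are assumptions chosen only for illustration.

```python
from typing import Dict, List


def contribution_weights(
    data_scores: Dict[str, float],
    model_scores: Dict[str, float],
    alpha: float = 0.5,
) -> Dict[str, float]:
    """Turn two per-center scores into aggregation weights that sum to 1.

    Assumption: the scores are blended as alpha * data + (1 - alpha) * model.
    The abstract does not specify the actual combination rule.
    """
    combined = {
        center: alpha * data_scores[center] + (1.0 - alpha) * model_scores[center]
        for center in data_scores
    }
    total = sum(combined.values())
    if total <= 0:
        raise ValueError("combined contribution scores must sum to a positive value")
    return {center: score / total for center, score in combined.items()}


def aggregate(
    params: Dict[str, List[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Weighted average of flat parameter vectors, one vector per center."""
    centers = list(params)
    length = len(params[centers[0]])
    return [
        sum(weights[center] * params[center][i] for center in centers)
        for i in range(length)
    ]


# Hypothetical scores and parameters for three centers (illustration only).
data_scores = {"A": 0.2, "B": 0.3, "C": 0.5}
model_scores = {"A": 0.4, "B": 0.4, "C": 0.2}
local_params = {"A": [1.0, 0.0], "B": [0.0, 1.0], "C": [1.0, 1.0]}

weights = contribution_weights(data_scores, model_scores, alpha=0.5)
global_params = aggregate(local_params, weights)
print(weights)
print(global_params)
```

Under this sketch, a center with higher combined scores pulls the global parameters further toward its own local model, which is the incentive effect the abstract describes. The per-center fine-tuning on core datasets would follow as a separate step and is not shown here.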