Abstract
Breast cancer is one of the leading causes of cancer death worldwide and is known for its aggressive growth and ability to spread. Although machine learning has shown good results for diagnosis, most existing methods neither quantify the uncertainty of their predictions nor explain them clearly. In this study, we present an integrated framework that combines uncertainty-aware ensemble learning with causal feature analysis and multimodal explainability for breast cancer prediction. The framework uses an ensemble of Light Gradient Boosting Machine (LightGBM), random forest, and gradient boosting classifiers with uncertainty estimation, so that the model can flag predictions in which it is less confident. It also applies causal analysis to detect possible clinical confounders and uses SHAP (SHapley Additive exPlanations), permutation importance, and feature attribution for interpretation. Tests on two public datasets showed strong and consistent performance. On the UCTH Clinical Dataset, the model reached an area under the curve (AUC) of 0.97, an accuracy of 0.95, and an F1 score of 0.94, with 100% precision (no false positives) on high-confidence cases. On the Breast Cancer Wisconsin dataset, it achieved an AUC of 0.99, an accuracy of 0.94, and an F1 score of 0.92, which rose to an accuracy of 0.98 and an F1 score of 0.98 when only confident predictions were considered. Causal analysis identified important clinical confounders such as lymph node involvement, tumor size, and metastasis, while fairness tests showed balanced results across demographic groups. Overall, the framework combines uncertainty estimation and causal interpretability to give predictions that are both accurate and trustworthy. It provides clinicians with an explicit confidence level for every prediction and supports transparent decision-making that can reduce diagnostic errors and improve reliability in clinical use.
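To make the idea of flagging less confident predictions concrete, the following is a minimal sketch rather than the framework's actual implementation. The abstract does not state how uncertainty is estimated, so the scheme here is an assumption: each ensemble member contributes a probability of malignancy, and a prediction is treated as confident only when the averaged probability is far from 0.5 and the members agree. The function name `flag_uncertain` and the thresholds `margin` and `max_spread` are hypothetical and chosen for illustration only.

```python
from statistics import mean, pstdev


def flag_uncertain(member_probs, margin=0.2, max_spread=0.15):
    """Combine per-model probabilities and mark low-confidence predictions.

    member_probs: one list per ensemble member, each holding P(malignant)
                  for every sample (all lists have the same length).
    margin:       assumed minimum distance of the mean probability from 0.5.
    max_spread:   assumed maximum standard deviation across members.
    """
    results = []
    # zip(*...) regroups the values so each tuple holds one sample's
    # probabilities from all ensemble members.
    for sample_probs in zip(*member_probs):
        prob = mean(sample_probs)        # soft-voting average
        spread = pstdev(sample_probs)    # disagreement between members
        confident = abs(prob - 0.5) >= margin and spread <= max_spread
        results.append(
            {
                "prob": prob,
                "label": int(prob >= 0.5),  # 1 = malignant, 0 = benign
                "spread": spread,
                "confident": confident,
            }
        )
    return results


# Illustrative probabilities from three hypothetical ensemble members.
example = [
    [0.95, 0.55, 0.10],
    [0.90, 0.40, 0.05],
    [0.97, 0.70, 0.12],
]
for row in flag_uncertain(example):
    print(row["label"], row["confident"])
```

Under this assumed scheme, metrics "on confident predictions only" would be computed over the rows where `confident` is true, while the remaining rows would be referred for further clinical review.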