Abstract
Bayesian network (BN) models are graphical models built on directed acyclic graphs, which represent the probabilistic dependencies between variables. In the context of predictive modeling, BN models can intuitively represent the collective contribution of factors in predicting an outcome variable. This study aimed to develop a BN model for predicting the two-year mortality of patients diagnosed with squamous cell carcinoma of the oral cavity (OCSCC). Secondary data for the study were obtained from a published cohort study conducted within the institute after ethical approval. The strength of association of the potential prognostic factors with the outcome variable was determined using multiple logistic regression (MLR). A hybrid BN model based on expert opinion and an association matrix (BN-H) was developed. The conditional dependencies between the variables were represented as the thickness of the edges between nodes. An MLR model and BN models based on the tree-augmented naive Bayes (TAN), expectation maximization (EM), and gradient boosting (GB) methods were developed for comparison. The Gini coefficient, sensitivity, specificity, misclassification rate, and area under the ROC curve were estimated on both the training and testing data for comparison. Age, smoking, alcohol, stage of the cancer, and treatment modality were found to be significant prognostic factors for mortality. The association matrix showed significant interdependencies between variables. The BN-H model was found to have predictive accuracy comparable to that of the MLR model. A BN model developed with expert opinion and an appropriate association matrix can be an alternative to existing predictive models for binary outcomes. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s13193-024-02164-w.
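The comparison metrics named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names, the 0.5 classification threshold, and the toy labels and probabilities are all assumptions made for illustration, and none of the numbers are study data. The sketch also assumes the Gini coefficient is the common rescaling of the area under the ROC curve, Gini = 2 × AUC − 1, which the abstract does not state.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) after thresholding predicted probabilities."""
    tp = fp = tn = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0  # assumed cut-off, not from the study
        if y == 1 and pred == 1:
            tp += 1
        elif y == 0 and pred == 1:
            fp += 1
        elif y == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def auc(y_true: Sequence[int], y_prob: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison (ties count one half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(
        1.0 if pp > pn else 0.5 if pp == pn else 0.0 for pp in pos for pn in neg
    )
    return wins / (len(pos) * len(neg))


def evaluate(
    y_true: Sequence[int], y_prob: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Compute the comparison metrics for a binary outcome."""
    tp, fp, tn, fn = confusion_counts(y_true, y_prob, threshold)
    area = auc(y_true, y_prob)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "misclassification_rate": (fp + fn) / len(y_true),
        "auc": area,
        "gini": 2 * area - 1,  # assumed definition: Gini = 2 * AUC - 1
    }


# Toy data only: 1 = died within two years, 0 = survived (hypothetical).
toy_true = [1, 1, 1, 0, 0, 0, 0, 1]
toy_prob = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.1, 0.7]
toy_metrics = evaluate(toy_true, toy_prob)
```

Running `evaluate` separately on the training and testing predictions of each model would give the kind of side-by-side comparison the abstract describes, under the assumptions stated above.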