Abstract
Artificial Intelligence (AI) and automation are increasingly transforming the job market, creating a need for better methods to improve students' job opportunities and career satisfaction. In this context, data mining plays a crucial role by uncovering hidden patterns and relationships within large-scale educational and behavioral datasets, enabling more accurate, data-driven insights. This study investigates the use of predictive data mining to analyze career satisfaction based on students' academic achievements and behavioral traits. Specifically, we explore the efficacy of a Bidirectional Encoder Representations from Transformers (BERT) model, which combines embedding layers with feed-forward networks in its multi-layer transformer blocks to capture complex, non-linear relationships among diverse educational and behavioral factors. For comparison, traditional machine learning models (support vector machines, logistic regression, and random forest) and a deep learning baseline using gated recurrent units were implemented on the same Education & Career Success dataset. The empirical analysis shows that the BERT model substantially outperforms these baselines, achieving a peak classification accuracy of 98%, compared to 80-85% for the traditional and deep learning approaches. This performance highlights the proposed model's ability to integrate and contextualize multifaceted input features, making it a powerful tool for predicting career satisfaction outcomes.