Abstract
The integration of artificial intelligence (AI) and natural language processing (NLP) into language learning and assessment has opened new possibilities for accurately profiling English language learners (ELLs) and personalizing educational interventions. Previous studies have typically focused on isolated techniques (deep learning, traditional machine learning, or linguistic rule-based models), leaving a critical need for comprehensive frameworks that combine the interpretability of rule-based reasoning with the predictive power of advanced AI. To address this gap, the present study introduces a novel hybrid methodology for ELL evaluation that combines rule mining through fuzzy logic with a state-of-the-art fusion model, DeBERTa + Metadata + LSTM (DBML). In DBML, DeBERTa serves as the transformer backbone that extracts rich textual embeddings via attention mechanisms, metadata features capture contextual, cognitive, and demographic learner traits, and LSTM layers provide temporal modeling and dense integration. This pipeline predicts language proficiency levels from both unstructured data (text responses) and structured data (behavioral and demographic attributes). Empirical comparisons against standard machine learning, deep learning, and standalone transformer models show that the proposed hybrid approach performs best, achieving a peak accuracy of 93%, significantly higher than the benchmarked baselines. Furthermore, the study investigates model reliability using statistical significance tests and eXplainable AI (XAI) techniques such as SHAP and DeepSHAP.
These analyses not only confirm the model's robustness but also reveal the centrality of linguistic attributes (e.g., Syntax, Cohesion, Vocabulary) in classification. This finding is further substantiated by a comprehensive feature ranking based on Information Gain, Gain Ratio, Gini Index, and random-forest permutation importance, with the top-ranked features used for fuzzy rule extraction.
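As an illustration of the ranking criteria named above, the following minimal sketch computes Information Gain, Gain Ratio, and Gini impurity decrease for discretized features. It is not the study's implementation: the toy data, the feature names `syntax` and `cohesion`, the low/high binning, and all function names are assumptions made for this example, and permutation importance is omitted because it requires a trained model.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())

def gini(values):
    """Gini impurity of a list of discrete values."""
    n = len(values)
    return 1.0 - sum((c / n) ** 2 for c in Counter(values).values())

def split_by(feature, labels):
    """Group the labels by the value the feature takes on each sample."""
    groups = {}
    for value, label in zip(feature, labels):
        groups.setdefault(value, []).append(label)
    return list(groups.values())

def information_gain(feature, labels):
    """Label entropy minus the weighted entropy after splitting on the feature."""
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in split_by(feature, labels))
    return entropy(labels) - remainder

def gain_ratio(feature, labels):
    """Information gain normalized by the entropy of the feature itself."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0

def gini_decrease(feature, labels):
    """Label Gini impurity minus the weighted impurity after the split."""
    n = len(labels)
    weighted = sum(len(g) / n * gini(g) for g in split_by(feature, labels))
    return gini(labels) - weighted

# Hypothetical toy data: six learners, two binned linguistic features.
proficiency = ["basic", "basic", "basic", "advanced", "advanced", "advanced"]
features = {
    "syntax":   ["low", "low", "low", "high", "high", "high"],
    "cohesion": ["low", "high", "low", "high", "low", "high"],
}

# Rank the features by information gain, highest first.
ranking = sorted(features,
                 key=lambda name: information_gain(features[name], proficiency),
                 reverse=True)

for name in ranking:
    f = features[name]
    print(name,
          round(information_gain(f, proficiency), 4),
          round(gain_ratio(f, proficiency), 4),
          round(gini_decrease(f, proficiency), 4))
```

In this toy setting `syntax` separates the two proficiency classes perfectly and therefore ranks above `cohesion` on all three criteria. Continuous scores would first need to be discretized, and the binning scheme is a design choice that can change the resulting ranking.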