Abstract
The proliferation of AI-generated text, fueled by large language models (LLMs), presents pressing challenges for maintaining content authenticity, safeguarding academic integrity, and mitigating misinformation. This paper proposes a responsible detection and mitigation framework that leverages hybrid neural networks and multi-feature fusion to distinguish AI-generated text from human-authored content. The proposed model integrates BERT-based semantic embeddings, convolutional features extracted by a Text-CNN, and statistical descriptors into a unified representation. A CNN-BiLSTM architecture is employed to capture both local syntactic patterns and long-range semantic dependencies. The framework emphasizes responsible AI (RAI) by prioritizing interpretability and reducing bias in detection decisions. Extensive evaluations on a balanced benchmark dataset demonstrate the model's superior performance, achieving 95.4% accuracy, 94.8% precision, 94.1% recall, and a 96.7% F1-score, outperforming leading transformer-based baselines. The framework is further evaluated on the independent external CoAID dataset to demonstrate generalizability. This study contributes to the responsible deployment of LLMs by enhancing transparency and robustness in AI-generated content verification, paving the way for the secure and ethical integration of generative models into content management systems.
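To make the fusion architecture summarized above concrete, the sketch below assembles the three feature streams (BERT token embeddings, Text-CNN convolutional features, and statistical descriptors) into a single CNN-BiLSTM classifier. This is a minimal illustrative sketch in PyTorch, not the authors' implementation: the class name `FusionDetector`, the layer sizes, the kernel widths, and the choice of 16 statistical descriptors are all assumptions for demonstration.

```python
# Minimal sketch of the multi-feature fusion detector described in the
# abstract. All hyperparameters (filter counts, kernel sizes, hidden
# dimensions, number of statistical features) are illustrative
# assumptions, not the paper's exact configuration.
import torch
import torch.nn as nn

class FusionDetector(nn.Module):
    def __init__(self, bert_dim=768, n_stat=16, n_filters=128,
                 kernel_sizes=(3, 5, 7), lstm_hidden=128):
        super().__init__()
        # Text-CNN branch: 1-D convolutions over BERT token embeddings.
        # Odd kernel sizes with padding=k//2 keep sequence length fixed,
        # so the feature maps can be concatenated channel-wise.
        self.convs = nn.ModuleList([
            nn.Conv1d(bert_dim, n_filters, k, padding=k // 2)
            for k in kernel_sizes
        ])
        # BiLSTM over the concatenated convolutional feature maps to
        # capture long-range dependencies across the sequence.
        self.bilstm = nn.LSTM(n_filters * len(kernel_sizes), lstm_hidden,
                              batch_first=True, bidirectional=True)
        # Classifier over the fused sequence representation plus the
        # statistical descriptors (multi-feature fusion).
        self.classifier = nn.Sequential(
            nn.Linear(2 * lstm_hidden + n_stat, 64),
            nn.ReLU(),
            nn.Linear(64, 2),  # human-authored vs. AI-generated
        )

    def forward(self, bert_emb, stat_feats):
        # bert_emb:   (batch, seq_len, bert_dim) contextual embeddings
        # stat_feats: (batch, n_stat) hand-crafted statistical descriptors
        x = bert_emb.transpose(1, 2)                       # (B, dim, L)
        conv_out = torch.cat([torch.relu(c(x)) for c in self.convs], dim=1)
        seq = conv_out.transpose(1, 2)                     # (B, L, filters*k)
        _, (h_n, _) = self.bilstm(seq)
        # Concatenate the final forward and backward hidden states.
        sent = torch.cat([h_n[-2], h_n[-1]], dim=1)        # (B, 2*hidden)
        fused = torch.cat([sent, stat_feats], dim=1)       # feature fusion
        return self.classifier(fused)

# Shape check with dummy inputs (batch of 4, 128 tokens).
model = FusionDetector()
logits = model(torch.randn(4, 128, 768), torch.randn(4, 16))
print(logits.shape)  # torch.Size([4, 2])
```

In practice, the BERT embeddings would be produced by a pretrained encoder and the statistical descriptors (e.g., perplexity-style or stylometric measures) computed separately; both are passed in as precomputed tensors here to keep the sketch self-contained.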