Abstract
INTRODUCTION: Accurate monitoring of oxide composition is critical for ensuring cement quality and performance in industrial production. Conventional analytical techniques for this purpose are often time-consuming and costly, and they lack real-time capability. Although near-infrared (NIR) spectroscopy offers a rapid, non-destructive alternative, traditional chemometric models struggle to capture the highly nonlinear, high-dimensional characteristics of NIR spectra and offer limited interpretability. METHODS: To address these challenges, this paper proposes an interpretable TabNet-based multi-output regression method for predicting multiple oxide concentrations from NIR spectra. The method integrates sparse feature selection with adaptive information aggregation, which allows it to dynamically prioritize the most informative spectral regions during processing. This architecture supports both automatic wavelength selection and accurate prediction of oxide content. RESULTS: Extensive experiments on two cement datasets demonstrate that the proposed TabNet model consistently outperforms established baseline models in predictive accuracy. A key advantage of the framework is its interpretability: it generates sequential attention masks that highlight chemically meaningful wavebands associated with each oxide component. DISCUSSION: The framework provides a scalable and interpretable solution for spectral analysis, applicable not only to cement quality monitoring but also to other materials science applications. The code is available at https://github.com/Andrew-Leopard/CementOxidePredictor.
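As a rough illustration of how sparse, sequential attention masks can single out wavebands, the sketch below follows the general TabNet formulation, in which each decision step applies a sparsemax to prior-scaled attention logits and then down-weights features that have already been used. It is a minimal sketch under stated assumptions and not the implementation in the linked repository: the function names `sparsemax` and `sequential_masks`, the relaxation value `gamma`, and the fixed logits are all hypothetical, and in a trained model the logits would come from a learned attentive transformer rather than being supplied by hand.

```python
from typing import List


def sparsemax(scores: List[float]) -> List[float]:
    """Euclidean projection of scores onto the probability simplex.

    Unlike softmax, the result can contain exact zeros, which is what
    makes the resulting feature mask sparse.
    """
    z_sorted = sorted(scores, reverse=True)
    cumsum = 0.0
    tau = 0.0
    for k, z in enumerate(z_sorted, start=1):
        cumsum += z
        # The support condition holds for k = 1..k* and fails afterwards,
        # so the last update of tau corresponds to the support size k*.
        if 1.0 + k * z > cumsum:
            tau = (cumsum - 1.0) / k
    return [max(s - tau, 0.0) for s in scores]


def sequential_masks(step_logits: List[List[float]], gamma: float = 1.3) -> List[List[float]]:
    """Compute one sparse mask per decision step.

    step_logits[i][j] is the (hypothetical) attention logit for
    wavelength j at step i. The prior scale shrinks for wavelengths
    that earlier steps already attended to; gamma = 1.0 restricts each
    wavelength to a single step, larger values allow reuse.
    """
    n_features = len(step_logits[0])
    prior = [1.0] * n_features
    masks = []
    for logits in step_logits:
        mask = sparsemax([p * l for p, l in zip(prior, logits)])
        masks.append(mask)
        prior = [p * (gamma - m) for p, m in zip(prior, mask)]
    return masks


if __name__ == "__main__":
    # Three toy "wavelengths" and two decision steps with identical logits.
    for step, mask in enumerate(sequential_masks([[2.0, 1.0, 0.1]] * 2, gamma=1.0), start=1):
        print(f"step {step}: {mask}")
```

With `gamma=1.0` and identical logits at both steps, the first mask puts all of its weight on the first wavelength, and the second step is forced onto the remaining ones. This is the mechanism by which successive masks can spread across different spectral regions. The full TabNet model additionally weights each step's mask by that step's contribution to the output when aggregating feature importance, which this sketch omits.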