Abstract
Wastewater surveillance is an emerging strategy for monitoring the presence and dynamics of targeted substances, supporting better allocation of preventive actions and public health interventions. This paper investigates machine learning models for classifying wastewater samples by their concentration of C-Reactive Protein (CRP), a key biomarker of inflammation whose levels may rise due to the presence of certain drugs. Using absorption spectroscopy spectra, samples were classified into five concentration classes ranging from zero to [Formula: see text]g/ml. Rather than relying on a single model, this study evaluates and compares multiple machine learning algorithms to determine the most effective approach for this classification task. Accuracy, precision, recall, F1 score, and specificity were calculated for each model. The best model, the Cubic Support Vector Machine (CSVM), achieved accuracies between 64.88% and 65.48% across the full-spectrum and restricted-range spectral data. Confusion matrices and Receiver Operating Characteristic (ROC) curves are presented to visualize classification performance. The results highlight the potential of machine learning techniques to classify CRP levels in wastewater with moderate accuracy, offering promising insights for future biosensor development and real-time environmental monitoring.
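For readers unfamiliar with how accuracy, precision, recall, F1 score, and specificity extend to a multiclass problem, they are commonly computed per class in a one-vs-rest fashion from the confusion matrix. The following is a minimal sketch under stated assumptions, not the study's implementation: rows are taken to be true classes and columns predicted classes, the function name `per_class_metrics` is hypothetical, and the 3-class matrix is made-up toy data rather than results from this work.

```python
def per_class_metrics(cm):
    """One-vs-rest metrics from a square confusion matrix.

    Assumes cm[i][j] counts samples of true class i predicted as class j.
    Returns (overall accuracy, list of per-class metric dicts).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    results = []
    for k in range(n):
        tp = cm[k][k]                               # class k predicted as k
        fn = sum(cm[k]) - tp                        # class k predicted as another class
        fp = sum(cm[r][k] for r in range(n)) - tp   # other classes predicted as k
        tn = total - tp - fn - fp                   # everything not involving class k
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        results.append({
            "precision": precision,
            "recall": recall,
            "specificity": specificity,
            "f1": f1,
        })
    accuracy = sum(cm[k][k] for k in range(n)) / total
    return accuracy, results


# Toy 3-class matrix (illustrative values only, not data from the study).
toy_cm = [
    [5, 1, 0],
    [2, 3, 1],
    [0, 1, 7],
]
accuracy, metrics = per_class_metrics(toy_cm)
for label, m in enumerate(metrics):
    print(label, {name: round(value, 3) for name, value in m.items()})
print("accuracy:", accuracy)
```

The same function applies unchanged to a five-class matrix such as the one implied by the five concentration classes; averaging the per-class values (macro averaging) is one common way to report a single figure per metric.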