Abstract
Understanding the composition of dissolved organic matter (DOM) is essential for investigating the biogeochemical cycling of carbon and related elements. However, methods for assigning molecular formulas (MF) in high-resolution mass spectrometry remain poorly defined, which often results in the misidentification of DOM components. In this study, we established a metric-based framework for evaluating the assignment results of each method in terms of similarity, accuracy, and correctness. We selected six methods and evaluated them under different settings of elemental limits, filter rules, and selection rules. Our findings reveal that Formularity and TRFU are the most suitable methods for MF assignment in DOM. These two methods show high similarity ratios (93-99%) and low Bray-Curtis distances (0.13-0.14), indicating stronger assignment capability. Their correctness rates (86-87%) and low chemical diversity errors (0.14-0.39) indicate more accurate assignment results. Other methods (TEnvR, ICBM, and MFAssignR), which use separate filters, show unassigned error rates of up to 47% ± 18%, potentially omitting certain DOM components. At moderate dissolved organic carbon concentrations, TRFU performs better, whereas Formularity performs better at both high and low concentrations. This study provides recommendations for selecting analytical methods for DOM, facilitating a deeper understanding of its properties and role in aquatic ecosystems.
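For readers unfamiliar with the Bray-Curtis distance cited above, the following minimal Python sketch shows the standard definition applied to two assignment results. It assumes each method's output can be represented as a mapping from assigned formula to relative intensity. The function name, the input representation, and the example values are illustrative assumptions, not the study's implementation or data.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two {formula: intensity} mappings.

    0.0 means identical assignments; 1.0 means no shared formulas.
    """
    formulas = set(a) | set(b)
    # Sum of absolute intensity differences over all formulas seen by either method.
    diff = sum(abs(a.get(f, 0.0) - b.get(f, 0.0)) for f in formulas)
    # Sum of all intensities reported by both methods.
    total = sum(a.get(f, 0.0) + b.get(f, 0.0) for f in formulas)
    return diff / total if total else 0.0


# Hypothetical outputs of two assignment methods for the same spectrum.
method_a = {"C10H12O5": 40.0, "C12H14O6": 35.0, "C15H18O7": 25.0}
method_b = {"C10H12O5": 40.0, "C12H14O6": 30.0, "C16H20O8": 30.0}

distance = bray_curtis(method_a, method_b)
print(f"Bray-Curtis distance: {distance:.2f}")
```

A low value, such as the 0.13-0.14 range reported here, means the two results being compared share most of their assigned formulas at similar intensities.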