Abstract
Standardizing structural isomeric relationships and evaluating their distribution in chemical space remain major challenges in cheminformatics. Conventional molecular fingerprints and dimensionality reduction techniques are often sensitive to dataset size and structural complexity. Here, we introduce the Structural Isomer Cumulative molecular fingerprint (SIC), which quantitatively captures relative structural differences among isomers with high precision. SIC consists of two variables: SIC(em), representing exact mass, and SIC(L), a cumulative descriptor derived from substructural differences. SIC(L) enables calculation of relative structural distances within isomeric groups regardless of dataset size or molecular complexity. Using SIC, we quantified structural differences among positional, skeletal, and functional group isomers that existing descriptors did not adequately capture. Furthermore, a scatter plot of SIC(em) and SIC(L) visualized metabolite distributions across cellular compartments, and we identified nine endogenous metabolites whose structural characteristics suggest potential toxicity.