Abstract
Recent advances in artificial intelligence have produced powerful predictive models for chemical analysis, but model interpretability remains a challenge. Here, we introduce Unsupervised Hierarchical Symbolic Regression (UHSR), which provides an explainable solution while maintaining competitive predictive performance. Focusing on thin-layer chromatography (TLC), a crucial technique in molecular polarity analysis, UHSR automatically distills chemically intuitive retention indices and discovers explainable equations that link molecular structures to chromatographic behavior. Experiments show that UHSR derives concise and accurate governing equations linking polarity to molecular structure from the TLC dataset. A survey of 100 expert chemists indicates that UHSR earns more trust from chemists than traditional models do. We also show its adaptability to property prediction tasks beyond molecular polarity.