Abstract
Machine learning methods are used to derive a universal approach for calculating the self-diffusion coefficient of molecular fluids. Analytical expressions are obtained through symbolic regression for fluids both in bulk and confined in nanochannels. The symbolic regression framework is trained on molecular dynamics simulation data and correlates the self-diffusion coefficient with macroscopic properties such as density, temperature, and confinement width. New expressions are derived for nine different molecular fluids, and an all-fluid universal equation is also extracted to capture molecular behavior. In this way, a property that is computationally demanding to obtain is predicted from easy-to-define macroscopic parameters, bypassing traditional numerical methods based on mean squared displacement and autocorrelation functions at the atomistic level. To achieve generalizability and interpretability, simple symbolic expressions are selected from a pool of equations derived by genetic programming. The obtained expressions are physically consistent and are discussed in terms of explainability. Accurate prediction of the self-diffusion coefficient in both bulk and confined systems is important for advancing the fundamental understanding of fluid behavior and for guiding the design of nanoscale confinement devices containing real molecular fluids.
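For context, the traditional mean-squared-displacement route mentioned in the abstract can be sketched as below. This is a minimal illustration, not the authors' workflow. The function names and the trajectory layout are hypothetical. The sketch assumes unwrapped particle positions, a single time origin, and a three-dimensional system, and it applies the Einstein relation, MSD(t) ≈ 2·d·D·t at long times.

```python
from typing import List, Sequence

Vector = Sequence[float]


def mean_squared_displacement(trajectory: Sequence[Sequence[Vector]]) -> List[float]:
    """Return MSD(t) for each frame, relative to the first frame.

    Assumed (hypothetical) layout: trajectory[t][i] is the unwrapped
    position of particle i at frame t. Only one time origin is used;
    production analyses usually average over many origins.
    """
    origin = trajectory[0]
    n_particles = len(origin)
    msd = []
    for frame in trajectory:
        total = 0.0
        for r0, r in zip(origin, frame):
            total += sum((a - b) ** 2 for a, b in zip(r, r0))
        msd.append(total / n_particles)
    return msd


def self_diffusion_from_msd(times: Sequence[float],
                            msd: Sequence[float],
                            dim: int = 3) -> float:
    """Estimate D from the least-squares slope of MSD versus time.

    Einstein relation: MSD(t) ~ 2 * dim * D * t in the diffusive regime,
    so D = slope / (2 * dim). The caller is assumed to pass only the
    linear (long-time) part of the curve.
    """
    n = len(times)
    t_mean = sum(times) / n
    m_mean = sum(msd) / n
    cov = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd))
    var = sum((t - t_mean) ** 2 for t in times)
    return cov / var / (2 * dim)
```

In practice the MSD must be accumulated over long trajectories and many particles before the linear regime is reliable, which is the computational cost that a closed-form expression in density, temperature, and confinement width avoids.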