Abstract
The rational design of polyimides (PIs) with a targeted glass transition temperature (Tg) is crucial for advanced microelectronics applications. Data-driven approaches are promising, but there is a pressing need for models that are both predictive and physically interpretable, especially when datasets are limited. Herein, we present a highly interpretable Quantitative Structure-Property Relationship (QSPR) model for predicting the Tg of PIs. Using a Genetic Algorithm combined with Multiple Linear Regression (GA-MLR), we identified an optimal set of seven molecular descriptors from a curated dataset. The model shows robust predictive performance and strong generalization ability, as validated by rigorous statistical tests. Crucially, we provide a detailed physicochemical interpretation of the descriptors and unify their influence under the framework of free volume theory. We show that key descriptors govern Tg by modulating the fractional free volume through distinct mechanisms: descriptors such as Chi0n increase free volume by introducing molecular branching that disrupts chain packing, whereas MinPartialCharge influences Tg through its effect on intermolecular interactions. This mechanistic understanding is translated into clear molecular design guidelines that distinguish strategies for achieving high-Tg polymers from those for processable, low-Tg polymers. Our work establishes a reliable and transparent computational tool that bridges data-driven prediction with fundamental chemical insight to accelerate PI development.
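
The GA-MLR idea mentioned in the abstract can be illustrated with a minimal sketch: a genetic algorithm searches over fixed-size descriptor subsets, and each subset is scored by fitting a multiple linear regression. Everything below is an assumption made for illustration, not the authors' implementation: the abstract does not state the fitness function, GA settings, or data, so this sketch uses leave-one-out Q² as fitness, arbitrary GA parameters, hypothetical function names (`fit_ols`, `loo_q2`, `ga_select`, `make_synthetic`), and synthetic data in which only columns 0 and 3 carry signal. The study itself selects seven descriptors; the toy example selects two.

```python
import random

def fit_ols(X, y):
    """Ordinary least squares via the normal equations; returns [intercept, b1, ...]."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    # Augmented matrix [X^T X | X^T y]
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting
    for c in range(p):
        piv = max(range(c, p), key=lambda k: abs(A[k][c]))
        if abs(A[piv][c]) < 1e-12:
            raise ValueError("singular design matrix")
        A[c], A[piv] = A[piv], A[c]
        for k in range(p):
            if k != c:
                f = A[k][c] / A[c][c]
                A[k] = [a - f * b for a, b in zip(A[k], A[c])]
    return [A[i][p] / A[i][i] for i in range(p)]

def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def loo_q2(X, y, cols):
    """Leave-one-out cross-validated Q^2 for the descriptor subset `cols` (assumed fitness)."""
    sub = [[r[c] for c in cols] for r in X]
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        try:
            coef = fit_ols(sub[:i] + sub[i + 1:], y[:i] + y[i + 1:])
        except ValueError:
            return float("-inf")  # collinear subset: treat as unusable
        press += (y[i] - predict(coef, sub[i])) ** 2
    return 1.0 - press / sum((t - mean_y) ** 2 for t in y)

def ga_select(X, y, k, pop_size=30, generations=40, mut_rate=0.2, seed=0):
    """Search for the k-descriptor subset with the highest LOO Q^2."""
    rng = random.Random(seed)
    n_desc = len(X[0])
    cache = {}

    def fitness(cols):
        if cols not in cache:
            cache[cols] = loo_q2(X, y, cols)
        return cache[cols]

    pop = [tuple(sorted(rng.sample(range(n_desc), k))) for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        # Truncation selection: the better half survives as parents
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            # Crossover: draw k descriptors from the union of both parents
            child = set(rng.sample(list(set(a) | set(b)), k))
            if rng.random() < mut_rate:
                # Mutation: swap one descriptor for one not in the subset
                child.remove(rng.choice(sorted(child)))
                child.add(rng.choice([d for d in range(n_desc) if d not in child]))
            children.append(tuple(sorted(child)))
        pop = parents + children
        best = max(pop + [best], key=fitness)
    return best, fitness(best)

def make_synthetic(n=40, n_desc=6, seed=1):
    """Synthetic stand-in data: only descriptors 0 and 3 influence y."""
    rng = random.Random(seed)
    X = [[rng.uniform(-1, 1) for _ in range(n_desc)] for _ in range(n)]
    y = [3.0 * r[0] - 2.0 * r[3] + rng.gauss(0, 0.1) for r in X]
    return X, y

if __name__ == "__main__":
    X, y = make_synthetic()
    best, q2 = ga_select(X, y, k=2)
    print("selected descriptor columns:", best, "LOO Q^2:", round(q2, 3))
```

In a real workflow the columns of `X` would be computed molecular descriptors (Chi0n and MinPartialCharge are named in the abstract) and `y` would be measured Tg values. The linear form is what keeps the model interpretable: each fitted coefficient gives the sign and size of one descriptor's contribution.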