Abstract
Compounds with a small gap between the highest occupied and lowest unoccupied molecular orbitals (HOMO-LUMO gap, E_gap) are highly desirable because their instability and reactivity make them useful for a wide range of applications. However, searching for new organic compounds with a low E_gap is expensive because the pool of virtual compounds grows exponentially. Accordingly, in this study, atomic Signatures were used as molecular descriptors to investigate the correlation between molecular structure and the B3LYP-computed E_gap, with the aim of developing a quantitative structure-property relationship (QSPR). A robust, easy-to-use model was constructed using forward-stepping multilinear regression with leave-one-out cross-validation, yielding a coefficient of determination (r²) of 0.86 and a cross-validated predictability (q²) of 0.76. With atomic Signatures as descriptors, the model revealed correlations between different structural motifs and E_gap. Atomic fragments containing π-bonds in various aromatic compounds were the most significant atomic Signatures: they explained nearly 50% of the variance in the data, and their regression coefficients lowered the predicted E_gap. This effect is attributed to π-electron delocalization, which makes these fragments reactive sites within a molecule. Finally, an external test set was used to further evaluate the model's predictive performance. The developed QSPR can serve as a reliable initial screening tool for identifying candidate compounds with low E_gap values.
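The modeling procedure named above, forward-stepping multilinear regression scored by leave-one-out cross-validation, can be illustrated with a minimal sketch. It is not the authors' implementation. The descriptor matrix is synthetic random counts standing in for atomic Signature occurrences, the target values and coefficients are invented for illustration, and the stopping rule (a minimum gain in q², here the hypothetical parameter `min_gain`) is an assumption, since the abstract does not state how the stepping was terminated. Here q² is computed as 1 - PRESS/SS_tot, with SS_tot taken about the mean of the full data set.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit of y on X with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    return coef[0] + X @ coef[1:]


def r_squared(X, y):
    """Coefficient of determination of the fit on all data."""
    resid = y - predict(fit_ols(X, y), X)
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)


def q_squared_loo(X, y):
    """Leave-one-out q^2 = 1 - PRESS / SS_tot."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i          # hold out compound i
        coef = fit_ols(X[keep], y[keep])  # refit without it
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)


def forward_select(X, y, min_gain=0.01):
    """Greedily add the descriptor that most improves q^2 (assumed criterion)."""
    selected, best_q2 = [], -np.inf
    remaining = list(range(X.shape[1]))
    while remaining:
        scores = [(q_squared_loo(X[:, selected + [j]], y), j) for j in remaining]
        q2, j = max(scores)
        if q2 - best_q2 < min_gain:       # stop when the gain is negligible
            break
        selected.append(j)
        remaining.remove(j)
        best_q2 = q2
    return selected, best_q2


# Synthetic stand-in data: 60 "compounds", 6 descriptor counts (all hypothetical).
rng = np.random.default_rng(0)
X = rng.integers(0, 5, size=(60, 6)).astype(float)
# Invented target: descriptor 0 lowers the gap, descriptor 2 raises it slightly.
y = 5.0 - 0.8 * X[:, 0] + 0.3 * X[:, 2] + rng.normal(0.0, 0.2, size=60)

selected, q2 = forward_select(X, y)
r2 = r_squared(X[:, selected], y)
print("selected descriptors:", selected)
print(f"r^2 = {r2:.3f}, q^2 = {q2:.3f}")
```

For an ordinary least-squares fit, q² computed this way cannot exceed r² on the same descriptors, so a gap between the two, as with the reported 0.86 and 0.76, indicates how much of the apparent fit fails to carry over to held-out compounds.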