Abstract
Stereoselective catalysis is a paradigm of complexity, navigating arrays of specific, yet weak, noncovalent interactions to achieve asymmetric induction. Accordingly, structure-activity relationships modeling stereoselectivity have required bespoke representations, high dimensionality, and/or large data sets to furnish productive trends, hampering their subsequent interpretability and utility. Here, we report that active site-based buried volume is a uniquely advantageous descriptor for constructing models of stereoselectivity by Brønsted acid organocatalysts. Through statistical analyses of more than 50 data sets spanning nearly 200 distinct catalysts, we find that active site-based buried volume is an exceptional representation, accommodating both extensive structural diversity (within data sets) and functional diversity (across data sets). We show that the descriptor's value likely stems from its surprising capacity to generalize beyond simple steric interactions and account for diverse stereoelectronic effects consequential to reported stereoselectivity. As such, we anticipate that future modeling of stereoselectivity by Brønsted acid organocatalysts will be greatly simplified by this concise representation.