Abstract
The propensity score (PS) is widely used to control for large numbers of covariates in high-dimensional healthcare database studies. In these settings, the least absolute shrinkage and selection operator (LASSO) is commonly used to estimate the PS, with the regularization parameter typically chosen by cross-validation to optimize out-of-sample prediction. For PS weighting, however, theory and simulations have shown that prediction-based tuning can over-regularize the PS model and that less regularization ("undersmoothing") is needed to minimize bias in PS-weighted estimators. The optimal degree of undersmoothing can be guided by the efficient influence function of the target causal parameter. In practice, however, the efficient influence function is often unknown or difficult to derive. We therefore investigate balance metrics as a simple, broadly applicable approach for selecting the degree of undersmoothing when the efficient influence function is unavailable. Because balance metrics are blind to the efficient influence function, balance-based tuning does not guarantee minimal bias in PS-weighted estimators, so we also propose a framework for generating synthetic negative control exposures for bias detection. We show that synthetic negative control exposures can identify analyses that likely violate partial exchangeability due to inadequate control for measured confounding. In numerical studies, balance-based undersmoothing consistently reduced bias relative to standard cross-validation, while synthetic negative control exposures effectively flagged analyses with residual confounding. Together, these approaches provide practical tools for improving PS weighting analyses when theory-driven tuning is infeasible.