Abstract
Combinatorial expression libraries can improve product titers by optimizing multigene pathways, but the large number of possible genetic variants makes exhaustive testing impractical. Statistical Design of Experiments (DoE) offers an alternative that enables efficient exploration of gene expression landscapes with a limited number of measurements. Here, we applied this approach to modulate expression levels across all genes in the shikimate and para-aminobenzoic acid (pABA) biosynthesis pathways in Pseudomonas putida. From a theoretical library of 512 strain variants, we trained a regression model on a statistically structured sample comprising 2.7% of the total library, as defined by our DoE approach, and used the model to predict new genotypes with improved pABA titers. The initial screen yielded product titers ranging from 2 to 186.2 mg/L, and the model then guided a second round of strain engineering that reached a maximum titer of 232.1 mg/L. Our analysis indicated that aroB, encoding 3-dehydroquinate synthase, is a critical bottleneck in pABA biosynthesis. This study highlights the utility of combining DoE with linear regression modeling to systematically optimize complex metabolic pathways, paving the way for more efficient microbial production.
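
The DoE-plus-linear-regression workflow summarized above can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from this study: the 512-variant library is modeled as nine genes at two expression levels (2^9 = 512), the sample is a generic 16-run fractional factorial rather than the 2.7% sample used here, the titers are synthetic placeholders, and all function names are hypothetical.

```python
import itertools
import numpy as np

N_GENES = 9  # assumption: 2**9 = 512 variants; the actual factor layout is not stated


def full_library(n_genes=N_GENES):
    """All two-level expression combinations, coded -1 (low) / +1 (high)."""
    return np.array(list(itertools.product([-1, 1], repeat=n_genes)))


def fractional_design():
    """Generic 16-run 2^(9-5) fractional factorial (illustrative, not the study's design).

    Four base columns are fully enumerated; five further columns are
    generated as products of the base columns, which keeps all nine
    main-effect columns balanced and mutually orthogonal.
    """
    base = np.array(list(itertools.product([-1, 1], repeat=4)))
    a, b, c, d = base.T
    generated = [a * b, a * c, a * d, b * c, a * b * c * d]
    return np.column_stack([base] + generated)


def fit_main_effects(X, y):
    """Ordinary least squares for intercept plus one main effect per gene."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Predicted response for coded genotypes X under the main-effects model."""
    return coef[0] + X @ coef[1:]


# Synthetic demonstration only: these effect sizes are invented placeholders.
rng = np.random.default_rng(0)
true_coef = np.array([60.0, 4.0, -3.0, 25.0, 2.0, -1.5, 6.0, 1.0, -2.0, 3.0])

X_sample = fractional_design()
y_sample = true_coef[0] + X_sample @ true_coef[1:] + rng.normal(0.0, 1.0, len(X_sample))

coef = fit_main_effects(X_sample, y_sample)

# Rank every genotype in the full library by predicted response.
library = full_library()
predicted = predict(coef, library)
best_genotype = library[np.argmax(predicted)]

print("Estimated main effects:", np.round(coef[1:], 2))
print("Predicted best genotype (coded levels):", best_genotype)
```

In this sketch, the gene with the largest estimated main effect plays the role that the analysis above assigns to a bottleneck such as aroB. A main-effects model ignores gene-gene interactions, which is the trade-off accepted when only a small fraction of the library is measured.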