Abstract
SUMMARY: Designing mRNA coding sequences (CDSs) for vaccine development requires co-optimizing secondary structure stability and codon usage, which are typically measured by minimum free energy (MFE) and codon adaptation index (CAI), respectively. To address this challenge, we previously developed LinearCDSfold, a tool based on dynamic programming and beam search that generates a single CDS encoding a given protein sequence by jointly optimizing MFE and CAI. It computes an exact solution in cubic time and a high-quality approximate solution in linear time, both with respect to the CDS length. Because lowering MFE and raising CAI often conflict during CDS design, it is desirable to automatically generate Pareto-optimal CDSs, that is, CDSs for which no alternative improves one objective without worsening the other. To our knowledge, DERNA is the only existing tool with this functionality. In this work, we extend LinearCDSfold to automatically and efficiently generate a set of Pareto-optimal CDSs. Experiments on nine protein sequences show that LinearCDSfold performs comparably to DERNA in generating Pareto-optimal CDSs while running substantially faster. AVAILABILITY AND IMPLEMENTATION: LinearCDSfold can be downloaded from https://github.com/ablab-nthu/LinearCDSfold.
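To make the notion of Pareto optimality over MFE and CAI concrete, the following minimal Python sketch filters a list of candidate CDS designs down to its non-dominated subset. It is only an illustration of the definition, not the algorithm used by LinearCDSfold or DERNA. The function name `pareto_front`, the `Design` tuple layout, and all MFE and CAI values are hypothetical and chosen purely for the example.

```python
from typing import List, Tuple

# Hypothetical record layout: (label, MFE in kcal/mol, CAI in [0, 1]).
# Lower MFE means a more stable structure; higher CAI means better codon usage.
Design = Tuple[str, float, float]


def pareto_front(designs: List[Design]) -> List[Design]:
    """Return the designs that no other design dominates.

    A design is dominated if another design has MFE <= its MFE and
    CAI >= its CAI, with at least one of the two strictly better.
    """
    # Sort by MFE ascending; break ties by CAI descending so that the
    # best design among equal-MFE candidates is visited first.
    ordered = sorted(designs, key=lambda d: (d[1], -d[2]))

    front: List[Design] = []
    best_cai = float("-inf")
    for design in ordered:
        # Every design seen so far has MFE <= the current one, so the
        # current design survives only if its CAI is strictly higher.
        if design[2] > best_cai:
            front.append(design)
            best_cai = design[2]
    return front


# Invented values for illustration only; they are not results from the paper.
designs = [
    ("A", -300.0, 0.70),
    ("B", -280.0, 0.85),
    ("C", -290.0, 0.65),  # dominated by A (higher MFE, lower CAI)
    ("D", -250.0, 0.95),
    ("E", -260.0, 0.80),  # dominated by B (higher MFE, lower CAI)
]

front_labels = [d[0] for d in pareto_front(designs)]
print(front_labels)  # ['A', 'B', 'D']
```

Designs A, B, and D remain because each trades some stability for better codon usage, or the reverse, which is the kind of trade-off set a Pareto-optimal CDS generator is meant to return.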