Abstract
High expression of heterologous proteins is often achieved by integrating multiple copies of a gene into a host. However, such multicopy systems are prone to genetic instability due to homologous recombination between identical sequences. We present the multisequence ChimeraMap (MScMap), an algorithm that designs multiple synonymous coding sequences to minimize recombination risk while maintaining high expression. MScMap extends the ChimeraMap framework by selecting diverse nucleotide blocks from a host genome to encode the target protein, balancing host adaptation against sequence dissimilarity. We introduce heuristics for block selection and concatenation that reduce long common substrings, a known driver of recombination. Across a wide range of human proteins, our method outperforms a multi-objective evolutionary algorithm in both genetic stability and predicted expression while being significantly faster. We also show that MScMap can reduce sequence repeats within a single coding sequence. A web tool for single and multicopy coding sequence optimization is available online.
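To make the notion of a long common substring concrete, the following is a minimal sketch of how the longest nucleotide run shared by a set of synonymous coding sequences could be measured. It is an illustration only and not the MScMap implementation: the function names `longest_common_substring` and `max_shared_run` and the three toy sequences (all encoding the peptide Met-Lys-Leu) are assumptions introduced here.

```python
from itertools import combinations


def longest_common_substring(a: str, b: str) -> str:
    """Return one longest substring shared by a and b.

    Standard dynamic program in O(len(a) * len(b)) time, keeping only
    two rows of the table. Hypothetical helper, not part of MScMap.
    """
    best_len, best_end = 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end = curr[j], i
        prev = curr
    return a[best_end - best_len:best_end]


def max_shared_run(seqs):
    """Length of the longest substring shared by any pair of sequences."""
    return max(
        (len(longest_common_substring(x, y)) for x, y in combinations(seqs, 2)),
        default=0,
    )


# Toy example (assumed data): three synonymous encodings of Met-Lys-Leu.
copies = ["ATGAAACTG", "ATGAAGCTT", "ATGAAATTA"]
print(longest_common_substring(copies[0], copies[1]))  # ATGAA
print(max_shared_run(copies))  # 6
```

A lower value of `max_shared_run` indicates that no pair of copies shares a long identical stretch. For long sequences, a suffix-automaton or suffix-array approach would scale better than this quadratic table.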