ARCADE: Controllable Codon Design from Foundation Models via Activation Engineering

Abstract

Codon sequence design is crucial for generating mRNA sequences with desired functional properties for tasks such as developing mRNA vaccines or gene editing therapies. Yet existing methods lack flexibility and controllability to adapt to various design objectives. We propose a novel machine learning-based framework, ARCADE, that enables flexible and controllable multi-objective codon design. Leveraging inherent knowledge from pretrained genomic language models, ARCADE extends activation engineering, a technique originally developed for controllable text generation, beyond discrete feature manipulation such as concepts and styles, to steering continuous-valued biological metrics. Specifically, we derive biologically meaningful semantic steering vectors in the model's activation space, which directly control properties such as the Codon Adaptation Index, Minimum Free Energy, and GC content. Experimental results demonstrate the flexibility of ARCADE in designing codon sequences with multiple objectives, underscoring its potential for advancing programmable biological sequence design. Our implementation is available at https://github.com/Kingsford-Group/arcade.
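
To make the idea of a steering vector concrete, here is a minimal sketch of one common activation-engineering recipe: take the difference between the mean activations of high-property and low-property examples, then add a scaled copy of that difference to a hidden state. This is a generic illustration, not the ARCADE implementation; the function names, the toy 3-dimensional activations, and the strength parameter `alpha` are all assumptions made for the example. GC content is included as a simple example of a continuous-valued metric that synonymous codon choice can change.

```python
from statistics import fmean


def gc_content(seq: str) -> float:
    """Fraction of G/C nucleotides in an RNA or DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def mean_vector(rows):
    """Element-wise mean of equal-length activation vectors."""
    return [fmean(col) for col in zip(*rows)]


def steering_vector(high_acts, low_acts):
    """Difference of means: high-property minus low-property activations."""
    mu_high = mean_vector(high_acts)
    mu_low = mean_vector(low_acts)
    return [h - l for h, l in zip(mu_high, mu_low)]


def steer(hidden, vector, alpha):
    """Shift a hidden state along the steering vector by strength alpha."""
    return [h + alpha * v for h, v in zip(hidden, vector)]


# Two synonymous encodings of Met-Ala with different GC content.
gc_rich = gc_content("AUGGCC")  # 4 of 6 bases are G or C
gc_poor = gc_content("AUGGCU")  # 3 of 6 bases are G or C

# Toy activations (made-up numbers, not taken from any real model).
high_gc_acts = [[1.0, 2.0, 0.0], [3.0, 2.0, 2.0]]
low_gc_acts = [[0.0, 2.0, 1.0], [2.0, 0.0, 1.0]]

v = steering_vector(high_gc_acts, low_gc_acts)
steered = steer([0.5, 0.5, 0.5], v, alpha=2.0)
```

In this sketch, a positive `alpha` pushes the hidden state toward the high-property examples and a negative one pushes it away. How the vectors are actually derived and applied for the Codon Adaptation Index, Minimum Free Energy, and GC content is described in the paper and the linked repository.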