siRNA Features-Automated Machine Learning of 3D Molecular Fingerprints and Structures for Therapeutic Off-Target Data

Abstract

Chemical modifications are the standard for small interfering RNAs (siRNAs) in therapeutic applications, but predicting their off-target effects remains a significant challenge. Current approaches often rely on sequence-based encodings, which fail to fully capture the structural and protein-RNA interaction details critical for off-target prediction. In this study, we developed a framework to generate reproducible structure-based chemical features, incorporating both molecular fingerprints and computationally derived siRNA-hAgo2 complex structures. Using an RNA-Seq off-target study, we generated over 30,000 siRNA-gene data points and systematically compared nine distinct types of feature representation strategies. Among the datasets, the highest predictive performance was achieved by Dataset 3, which used extended connectivity fingerprints (ECFPs) to encode siRNA and mRNA features. An energy-minimized dataset (7R), representing siRNA-hAgo2 structural alignments, was the second-best performer, underscoring the value of incorporating reproducible structural information into feature engineering. Our findings demonstrate that combining detailed structural representations with sequence-based features enables the generation of robust, reproducible chemical features for machine learning models, offering a promising path forward for off-target prediction and siRNA therapeutic design that can be seamlessly extended to include any modification, such as clinically relevant 2'-F or 2'-OMe.
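
For readers unfamiliar with extended connectivity fingerprints, the idea behind them can be illustrated with a minimal pure-Python sketch. It is a simplified illustration and not the authors' pipeline or any cheminformatics library's implementation: it uses only element and degree as atom invariants, skips the duplicate-substructure removal of full ECFP, and the function names (`circular_fingerprint`, `tanimoto`) and toy heavy-atom graphs are assumptions made for this example.

```python
import hashlib


def _hash_to_int(payload: str) -> int:
    """Deterministic 64-bit hash (built-in hash() is salted for strings)."""
    return int.from_bytes(hashlib.sha256(payload.encode()).digest()[:8], "big")


def circular_fingerprint(atoms, bonds, radius=2, n_bits=1024):
    """Toy ECFP-like fingerprint (simplified; an assumption of this sketch).

    atoms: list of element symbols, e.g. ["C", "C", "O"]
    bonds: list of (i, j, bond_order) tuples over atom indices
    Returns a sorted list of set bit positions in [0, n_bits).
    """
    neighbors = {i: [] for i in range(len(atoms))}
    for i, j, order in bonds:
        neighbors[i].append((order, j))
        neighbors[j].append((order, i))

    # Iteration 0: each atom's identifier comes from its own invariants.
    ids = [_hash_to_int(f"{sym}|{len(neighbors[i])}") for i, sym in enumerate(atoms)]
    features = set(ids)

    # Each further iteration folds in the neighbors' current identifiers,
    # so an identifier describes a neighborhood one bond larger.
    for _ in range(radius):
        new_ids = []
        for i in range(len(atoms)):
            env = sorted((order, ids[j]) for order, j in neighbors[i])
            new_ids.append(_hash_to_int(f"{ids[i]}|{env}"))
        ids = new_ids
        features.update(ids)

    # Fold the identifiers into a fixed-length bit vector.
    return sorted({f % n_bits for f in features})


def tanimoto(bits_a, bits_b):
    """Tanimoto (Jaccard) similarity between two lists of set bits."""
    a, b = set(bits_a), set(bits_b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Heavy-atom graphs only: ethanol (C-C-O) and dimethyl ether (C-O-C).
ethanol = circular_fingerprint(["C", "C", "O"], [(0, 1, 1), (1, 2, 1)])
ether = circular_fingerprint(["C", "O", "C"], [(0, 1, 1), (1, 2, 1)])
print(tanimoto(ethanol, ether))
```

Because every identifier is built from sorted neighbor environments, the resulting bit set does not depend on how the atoms are numbered, which is one reason such features are reproducible across runs. In practice a maintained toolkit would be used on full modified-nucleotide structures instead of a toy graph like this one.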
