CircCNNs, a convolutional neural network framework to better understand the biogenesis of exonic circRNAs



Abstract

Circular RNAs (circRNAs) have been extensively explored as biomarkers for cancer detection, but their biogenesis mechanism remains elusive. In contrast to the linear splicing (LS) involved in linear transcript formation, the so-called back splicing (BS) process has been proposed to explain circRNA formation. To investigate the potential mechanism of BS with a machine learning approach, we curated a high-quality dataset of BS and LS exon pairs using stringent, evidence-based filtering. Two convolutional neural network (CNN) base models with different structures for processing splicing junction sequences, including motif extraction, were created and compared after extensive hyperparameter tuning. In contrast to the previous study, we were able to identify motifs corresponding to well-established BS-associated genes such as MBNL1, QKI, and ESRP2. Importantly, despite the prevalence of high false positive rates in existing circRNA detection pipelines and databases, our base models demonstrated notably high specificity (greater than 90%). To further improve model performance, a novel fast numerical method was proposed and implemented to calculate the reverse complementary matches (RCMs) crossing the two flanking regions and within each flanking region of exon pairs. Our CircCNNs framework, which incorporates RCM information into the optimal base models, further reduced the false positive rate, leading to 88% prediction accuracy.
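
To make the idea of a reverse complementary match concrete, the following minimal Python sketch counts k-mer pairs in which a k-mer from one flanking region is the reverse complement of a k-mer from the other. It is an illustration only and is not the numerical method described in the abstract: the function names, the k-mer formulation, and the default `k = 5` are all assumptions introduced here.

```python
from collections import Counter

# Translation table for DNA complements (illustrative; handles A, C, G, T only).
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping k-mer in seq (empty if seq is shorter than k)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def count_rcm_pairs(upstream: str, downstream: str, k: int = 5) -> int:
    """Count (i, j) position pairs where the k-mer at i in `upstream`
    is the reverse complement of the k-mer at j in `downstream`.

    Hypothetical helper: the k-mers of reverse_complement(downstream) are
    exactly the reverse complements of the k-mers of downstream, so a plain
    k-mer lookup against that sequence finds all matching pairs.
    """
    up = kmer_counts(upstream, k)
    down_rc = kmer_counts(reverse_complement(downstream), k)
    return sum(n * down_rc[kmer] for kmer, n in up.items() if kmer in down_rc)
```

Calling `count_rcm_pairs(upstream_flank, downstream_flank)` gives a cross-region count. Hashing k-mers in this way avoids comparing every pair of positions directly, which is one common route to a fast count; whether the authors' method works similarly is not stated in the abstract.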
