Predicting somatic mutation origins in cell-free DNA by semi-supervised GAN models

Abstract

MOTIVATION: Distinguishing pathogenic cancer-associated mutations from other somatic variants present in cell-free DNA (cfDNA) is one of the challenges in liquid biopsy. The distinction is critical: misclassifying mutations that stem from clonal hematopoiesis (CH) as tumor-derived, or vice versa, could result in inaccurate diagnoses and inappropriate therapeutic interventions for patients.

RESULTS: We addressed this by developing a specialized machine learning technique to differentiate tumor-derived from CH-related mutations in cfDNA. We established a comprehensive in-house reference catalog of approximately 25,000 single nucleotide variants (SNVs), each linked to either tumor or CH origin. This reference serves as the foundation for training a deep learning model built on the semi-supervised generative adversarial network (SSGAN) architecture. By analyzing the genomic coordinates and nucleotide composition of cfDNA variants, our model attains an area under the curve (AUC) of 95% in classifying uncharacterized variants as CH-derived or tumor-derived. In conclusion, our research emphasizes the potential of genomic feature prediction from cfDNA data as a robust alternative to conventional multi-analyte sequencing methods. This approach not only improves the accuracy of distinguishing CH from tumor mutations in liquid biopsy data, but also highlights the potential of advanced data analysis techniques and machine learning in genomics and personalized medicine.

Availability: https://github.com/FPalizban/SSGAN
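
To make the SSGAN idea concrete, the following is a minimal sketch of the discriminator objective commonly used in semi-supervised GANs, not the authors' implementation (which is available at the repository above). It assumes two real classes (CH and tumor) and treats "generated" as an implicit third class whose logit is fixed at 0. The function names, the class indices, and the example logits are all hypothetical and chosen only for illustration.

```python
import math

CH, TUMOR = 0, 1  # hypothetical class indices for the two variant origins


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of logits."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def softplus(x):
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def class_probabilities(logits):
    """Softmax over the real classes, e.g. [P(CH), P(tumor)]."""
    lse = log_sum_exp(logits)
    return [math.exp(v - lse) for v in logits]


def supervised_loss(logits, label):
    """Cross-entropy over the real classes for one labeled variant."""
    return log_sum_exp(logits) - logits[label]


def unsupervised_loss(real_logits, fake_logits):
    """Real-vs-generated loss for one unlabeled and one generated sample.

    With the 'generated' logit fixed at 0, D(x) = Z / (Z + 1) where
    Z = sum(exp(logits)), so:
      -log D(x_real)     = softplus(lse_real) - lse_real
      -log(1 - D(x_gen)) = softplus(lse_fake)
    """
    lse_real = log_sum_exp(real_logits)
    lse_fake = log_sum_exp(fake_logits)
    return (softplus(lse_real) - lse_real) + softplus(lse_fake)


# Illustrative logits only; a real model would compute these from variant features.
labeled = supervised_loss([0.2, 2.1], TUMOR)
unlabeled = unsupervised_loss([1.5, -0.3], [-2.0, -1.0])
total_discriminator_loss = labeled + unlabeled
```

The supervised term uses the labeled reference variants, while the unsupervised term lets unlabeled and generated samples shape the decision boundary, which is the general motivation for a semi-supervised setup when only part of the catalog carries a confirmed origin.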
