A Deep Autoencoder Compression-Based Genomic Prediction Method for Whole-Genome Sequencing Data



Abstract

Genomic prediction using whole-genome sequencing (WGS) data is challenged by the imbalance between a limited sample size (n) and a very large number of single-nucleotide polymorphisms (SNPs) (p), where n ≪ p. The high dimensionality of WGS data also increases computational demands, limiting its practical application. In this study, we introduce DAGP, a novel method that uses deep autoencoder compression to reduce the dimensionality of WGS data by over 99% while preserving essential genetic information. This compression substantially improves computational efficiency, making high-dimensional genomic data practical to use. Our results show that DAGP, when combined with the genomic best linear unbiased prediction (GBLUP) method, maintained prediction accuracy comparable to that obtained with the full WGS data, even at reduced marker densities of 50 K for sturgeon and 20 K for maize. Furthermore, integrating DAGP with Bayesian and machine learning models improved genomic prediction accuracy over traditional WGS-based GBLUP, with average gains of 6.05% and 5.35%, respectively. DAGP provides an efficient and scalable solution for genomic prediction in species with large-scale genomic data, offering both computational feasibility and enhanced prediction performance.
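The compress-then-predict idea described in the abstract can be illustrated with a small toy sketch. The abstract does not specify DAGP's architecture, loss, or hyperparameters, so everything below is an assumption made for illustration: a one-hidden-layer tanh autoencoder trained by plain gradient descent, simulated genotypes coded 0/1/2, a relationship matrix built from the latent codes, and a fixed variance ratio `lam` (in practice the variance components would be estimated, for example by REML). The function names `train_autoencoder` and `gblup_predict` are hypothetical and are not taken from the paper.

```python
import numpy as np

def train_autoencoder(X, k, epochs=300, lr=0.01, seed=0):
    """Toy one-hidden-layer tanh autoencoder (illustrative, not the DAGP network)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    W1 = rng.normal(0.0, 0.1, (p, k)); b1 = np.zeros(k)
    W2 = rng.normal(0.0, 0.1, (k, p)); b2 = np.zeros(p)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)             # latent codes, n x k
        E = H @ W2 + b2 - X                  # reconstruction error, n x p
        losses.append(float((E ** 2).sum() / n))
        dR = 2.0 * E / n                     # gradient of the loss w.r.t. the reconstruction
        dZ = (dR @ W2.T) * (1.0 - H ** 2)    # backpropagate through tanh
        W2 -= lr * (H.T @ dR); b2 -= lr * dR.sum(axis=0)
        W1 -= lr * (X.T @ dZ); b1 -= lr * dZ.sum(axis=0)
    encode = lambda A: np.tanh(A @ W1 + b1)
    return encode, losses

def gblup_predict(Z, y, train, test, lam=1.0):
    """GBLUP-style prediction in kernel form; lam is an assumed sigma_e^2 / sigma_g^2."""
    Zc = Z - Z[train].mean(axis=0)           # center features on the training set
    G = Zc @ Zc.T / Z.shape[1]               # relationship matrix from compressed features
    mu = y[train].mean()
    A = G[np.ix_(train, train)] + lam * np.eye(len(train))
    alpha = np.linalg.solve(A, y[train] - mu)
    return mu + G[np.ix_(test, train)] @ alpha

# Simulated data (assumption): n = 60 individuals, p = 200 SNPs, k = 8 latent features.
rng = np.random.default_rng(1)
n, p, k = 60, 200, 8
freq = rng.uniform(0.1, 0.9, p)
X_raw = rng.binomial(2, freq, size=(n, p)).astype(float)   # genotypes coded 0/1/2
X = X_raw - X_raw.mean(axis=0)                             # center each SNP
y = X @ rng.normal(0.0, 0.2, p) + rng.normal(0.0, 1.0, n)  # simulated phenotype

encode, losses = train_autoencoder(X, k)
codes = encode(X)                                          # compressed genotypes, n x k
train, test = np.arange(50), np.arange(50, 60)
preds = gblup_predict(codes, y, train, test)
print(codes.shape, preds.shape)
```

In this toy setting the 200 simulated SNPs are reduced to 8 latent features before the relationship matrix is built, so the cost of forming `G` scales with the compressed dimension rather than with the number of SNPs. The sketch makes no claim about matching the accuracy figures reported in the abstract.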
