Robust Transfer Learning for High-Dimensional GLM Using γ-Divergence With Applications to Cancer Genomics.

Authors: Xu Fuzhi, Ma Shuangge, Zhang Qingzhao, Xu Yaqing
In the analysis of complex diseases, high-dimensional profiling data are important for assessing risk and detecting biomarkers. Although cancer genomic data are increasingly accessible, sample sizes remain limited in most studies, so borrowing information from additional data sources is desirable to improve estimation and prediction. Transfer learning has proven flexible and effective in boosting modeling performance, with a track record in biomedical applications. However, existing transfer learning methods often lack robustness to outliers and data contamination, both of which are common in real-world biomedical data. In this study, we propose a robust transfer learning approach based on the minimum γ-divergence under a generalized linear model (GLM) framework for high-dimensional data. Our method incorporates a data-driven source detection scheme that automatically identifies informative sources while mitigating the risk of negative transfer. We establish rigorous theoretical results, including consistency and high-dimensional estimation error bounds, that support the robustness and reliability of the method. A computationally efficient algorithm based on proximal gradient descent is developed to carry out both the transfer and debiasing steps. Simulations demonstrate the superior or competitive performance of the proposed approach in selection and prediction/classification. We further validate its practical utility by analyzing data on breast cancer and glioblastoma, showcasing the method's effectiveness in real-world high-dimensional settings.
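To make the idea of a γ-divergence loss minimized by proximal gradient descent concrete, here is a minimal sketch. It is not the authors' algorithm: it assumes the simplest GLM (a Gaussian linear model with a known, fixed noise scale `sigma`), uses a plain L1 penalty, and leaves out the transfer, source detection, and debiasing steps entirely. The function names, the backtracking step-size rule, and the simulated data are all illustrative assumptions.

```python
import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (elementwise soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def gamma_loss_grad(beta, X, y, gamma, sigma):
    """Gamma-divergence loss and gradient for a Gaussian linear model.

    Assumes sigma is fixed, so terms that do not depend on beta are dropped:
    loss = -(1/gamma) * log(mean(exp(-gamma * r^2 / (2 sigma^2)))).
    """
    r = y - X @ beta
    a = -gamma * r**2 / (2.0 * sigma**2)
    a_max = a.max()                    # shift for a stable log-sum-exp
    w = np.exp(a - a_max)
    s = w.sum()
    loss = -(a_max + np.log(s / y.size)) / gamma
    # Normalized weights w / s shrink toward 0 for large residuals,
    # which is what downweights outlying observations.
    grad = -(X.T @ (w / s * r)) / sigma**2
    return loss, grad

def robust_lasso(X, y, lam, gamma=0.5, sigma=1.0, n_iter=300):
    """Proximal gradient descent with backtracking on loss + lam * ||beta||_1."""
    beta = np.zeros(X.shape[1])
    step = 1.0
    for _ in range(n_iter):
        loss, grad = gamma_loss_grad(beta, X, y, gamma, sigma)
        while True:
            cand = soft_threshold(beta - step * grad, step * lam)
            diff = cand - beta
            new_loss, _ = gamma_loss_grad(cand, X, y, gamma, sigma)
            # Accept the step once the quadratic upper bound holds.
            if new_loss <= loss + grad @ diff + diff @ diff / (2.0 * step) + 1e-12:
                break
            step *= 0.5
        beta = cand
    return beta

# Simulated example (hypothetical data): sparse signal, 10% contaminated responses.
rng = np.random.default_rng(0)
n, p = 200, 20
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:2] = [2.0, -1.5]
y = X @ beta_true + rng.standard_normal(n)
y[:20] += 20.0                         # gross outliers in the response

beta_hat = robust_lasso(X, y, lam=0.05)
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]   # non-robust baseline
print("robust error:", np.linalg.norm(beta_hat - beta_true))
print("OLS error:   ", np.linalg.norm(beta_ols - beta_true))
```

In this sketch the gradient is a weighted least-squares gradient whose weights decay exponentially in the squared residual, so grossly contaminated observations contribute almost nothing to the update. The loss is non-convex, so the result can depend on the starting point; the zero initialization here is a convenience, not a recommendation.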