fuseMLR: an R package for integrative prediction modeling of multi-omics data



Abstract

BACKGROUND: Recent technological advances have enabled the simultaneous collection of multi-omics data, i.e., different types or modalities of molecular data. Integrative predictive modeling of such data is particularly challenging. Ideally, data from the different modalities are measured in the same individuals, allowing for early or intermediate integrative techniques. However, these techniques are often not applicable when patient data only partially overlap across modalities, because they then require either reducing the datasets to the shared individuals or imputing missing values. Additionally, the diversity of data modalities may necessitate modality-specific statistical methods rather than applying the same method across all modalities. Late integration modeling approaches analyze each data modality separately to obtain modality-specific predictions. These predictions are then aggregated, either by training a machine learning (ML) meta-model on them or by computing their weighted mean.

RESULTS: We introduce the R package fuseMLR for late integration prediction modeling. The package is user-friendly, enables variable selection and the application of different ML algorithms for each modality, and automatically performs aggregation once modality-specific training is completed. We illustrate the package’s functionality in a small simulation study and with two publicly available multi-omics datasets from The Cancer Genome Atlas.

CONCLUSION: The package fuseMLR enables predictive modeling with late integration in a systematic, structured, and reproducible way.

SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12859-025-06248-4.
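The weighted-mean form of late integration described in the background can be illustrated with a minimal sketch. It is written in plain Python rather than R and does not use or reproduce the fuseMLR API. The function names, the modality names, the inverse-error weighting scheme, and the renormalization over available modalities are all assumptions made for illustration only.

```python
from typing import Dict


def performance_weights(errors: Dict[str, float]) -> Dict[str, float]:
    """Turn modality-specific errors (lower is better) into weights summing to 1.

    Assumption: weights are proportional to the inverse error. This is one
    common choice, not necessarily the one any particular package uses.
    """
    inverse = {modality: 1.0 / err for modality, err in errors.items()}
    total = sum(inverse.values())
    return {modality: value / total for modality, value in inverse.items()}


def aggregate(
    predictions: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Weighted mean of modality-specific predictions per individual.

    predictions maps modality -> {individual id -> predicted value}.
    Individuals measured in only some modalities are handled by
    renormalizing the weights over the modalities that are available.
    """
    individuals = set()
    for per_modality in predictions.values():
        individuals.update(per_modality)

    fused = {}
    for individual in sorted(individuals):
        available = [m for m in predictions if individual in predictions[m]]
        norm = sum(weights[m] for m in available)
        fused[individual] = (
            sum(weights[m] * predictions[m][individual] for m in available) / norm
        )
    return fused


# Hypothetical modality-specific errors and predictions (illustrative values).
weights = performance_weights({"methylation": 0.2, "expression": 0.1})
fused = aggregate(
    {
        "methylation": {"p1": 0.3, "p2": 0.6},
        "expression": {"p1": 0.9, "p3": 0.4},
    },
    weights,
)
```

In this sketch, individual `p1` has predictions from both modalities and receives their weighted mean, while `p2` and `p3` each fall back to the single modality in which they were measured. No individual is dropped and no value is imputed, which is the property that makes late integration attractive when patient data only partially overlap.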
