Abstract
Identifying reproducible biomarkers and cancer subtypes from large-scale multiomics data remains challenging because current computational frameworks often require coding expertise and lack standardized datasets for model benchmarking and exploration. To address these limitations, we developed CancerSubtypeXplore, a modular, user-friendly platform for multiomics cancer subtype prediction and biomarker discovery. The system integrates 4 main components: (a) a dataset module that provides standardized, curated multiomics datasets for 17 cancer types from The Cancer Genome Atlas, including mRNA, DNA methylation, and microRNA profiles; (b) a machine learning module for automated benchmarking with classical algorithms such as support vector machines and random forests; (c) a "design your deep learning model" module that lets users design and train customized neural network architectures without coding; and (d) a biomarker analysis module that extracts the features that contribute to prediction from each trained model as candidate biomarkers, computes their intersections, and ranks them by frequency to identify robust cross-model or cross-cancer biomarkers. Benchmark experiments demonstrate consistent subtype prediction accuracy across multiple cancer types and reveal overlapping biomarkers that may serve as potential pan-cancer signatures. CancerSubtypeXplore provides a transparent, reproducible, and extensible environment in which biomedical researchers can explore multiomics datasets, evaluate diverse models, and identify robust biomarkers.
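
The intersection and frequency ranking described for the biomarker analysis module can be illustrated with a minimal Python sketch. This is not the CancerSubtypeXplore implementation: the function names `rank_biomarkers` and `shared_biomarkers`, the model labels, and the `GENE_*` identifiers are hypothetical placeholders, and the sketch assumes that each trained model has already yielded a list of selected features.

```python
from collections import Counter


def rank_biomarkers(feature_sets):
    """Rank features by the number of feature sets (models or cancers) containing them."""
    counts = Counter()
    for features in feature_sets.values():
        counts.update(set(features))  # count each feature at most once per set
    # Sort by descending frequency, then alphabetically for a deterministic order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def shared_biomarkers(feature_sets):
    """Return the features present in every feature set (the full intersection)."""
    sets = [set(features) for features in feature_sets.values()]
    return set.intersection(*sets) if sets else set()


# Hypothetical per-model feature lists; real inputs would come from trained models.
selected = {
    "svm": ["GENE_A", "GENE_B", "GENE_C"],
    "random_forest": ["GENE_B", "GENE_C", "GENE_D"],
    "neural_net": ["GENE_C", "GENE_E"],
}

ranking = rank_biomarkers(selected)
common = shared_biomarkers(selected)
print(ranking)
print(common)
```

The same two functions apply unchanged to a cross-cancer comparison if the dictionary keys are cancer types rather than models.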