Abstract
Early detection of bladder cancer is a major challenge for liquid biopsy because tumor burden is limited and tumor-derived DNA is scarce. In such low-signal settings, detection sensitivity depends critically on both biofluid selection and effective integration of weak, distributed molecular signals. We analyzed Enzymatic Methyl-seq (EM-seq) data from 41 matched urine-plasma pairs and found that urine samples had significantly higher tumor fractions and greater concordance with tissue methylation profiles than plasma. Based on this observation, we developed a urine-based bladder cancer detection framework using EM-seq. We profiled 143 urine samples (68 bladder cancer and 75 healthy controls) and 14 bladder cancer tissues. Methylation markers (113,052 regions) were identified by comparing cancer tissues (n = 14) with urine from healthy individuals (n = 14). We used XGBoost to evaluate candidate features and their combinations; the combination of methylation and copy number variation (CNV) features performed best and was adopted as the final ensemble model. On an independent test set, the model achieved 91.9% sensitivity at 80% specificity, with an area under the curve (AUC) of 0.932 for bladder cancer detection and 0.928 for non-muscle-invasive bladder cancer (NMIBC) detection. Notably, the model detected four of seven mutation-negative cases, indicating complementary value to mutation-based approaches.
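For readers unfamiliar with the "sensitivity at a fixed specificity" metric reported above, the following is a minimal sketch of how such an operating point can be read off a set of classifier scores. It is not the evaluation code used in this study: the function name `sensitivity_at_specificity` is hypothetical, the rule "call a sample positive when its score exceeds the threshold" is an assumption, and the scores are made-up toy values, not study data.

```python
def sensitivity_at_specificity(scores, labels, target_specificity):
    """Return (threshold, specificity, sensitivity) at the lowest threshold
    whose specificity on controls reaches the target.

    Assumption for this sketch: a sample is called positive when its
    score is strictly greater than the threshold.
    labels: 1 = cancer case, 0 = healthy control.
    """
    controls = [s for s, y in zip(scores, labels) if y == 0]
    cases = [s for s, y in zip(scores, labels) if y == 1]
    if not controls or not cases:
        raise ValueError("need at least one case and one control")

    # Scan candidate thresholds from low to high; specificity can only
    # increase as the threshold rises, so the first hit is the lowest one.
    for threshold in sorted(set(scores)):
        specificity = sum(c <= threshold for c in controls) / len(controls)
        if specificity >= target_specificity:
            sensitivity = sum(s > threshold for s in cases) / len(cases)
            return threshold, specificity, sensitivity

    # Unreachable: the maximum score always yields specificity 1.0.
    raise RuntimeError("no threshold found")


# Toy example (illustrative values only, not data from the study).
toy_scores = [0.1, 0.2, 0.3, 0.4, 0.9, 0.35, 0.6, 0.7, 0.8, 0.95]
toy_labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
threshold, specificity, sensitivity = sensitivity_at_specificity(
    toy_scores, toy_labels, target_specificity=0.8
)
print(threshold, specificity, sensitivity)
```

Fixing specificity first and then reporting sensitivity makes operating points comparable across feature sets, since each model is judged at the same false-positive rate among controls.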