Abstract
Convolutional neural networks (CNNs) have shown outstanding performance in image recognition, but their suitability for non-sequential tabular data remains debated. This study investigates the architectural sensitivity of CNNs applied to non-sequential medical datasets and compares their performance with that of multi-layer perceptrons (MLPs) under various structural settings. Three publicly available medical tabular datasets were used: the integrative clinical and CT feature dataset (iCTCF), Breast Cancer Wisconsin Diagnostic (BCWD), and UCI Heart Disease (UCI-HD). We systematically varied the number of kernels, the kernel sizes, and the fully connected (FC) nodes in a 1D-CNN architecture and compared classification performance against MLP models, conducting 1,000 feature-order permutation experiments to quantify order sensitivity under randomized structural settings. Effect-size statistics were computed to describe class separability; no feature filtering was performed. Across permutations, MLPs demonstrated superior stability, with significantly tighter dispersion than CNNs on all datasets. Although CNNs achieved peak AUROCs comparable to (BCWD: 0.987 vs. 0.986) or higher than (iCTCF: 0.739 vs. 0.681) those of MLPs in certain configurations, they exhibited greater performance variability and a distinct negative skew, reflecting high sensitivity to feature ordering. In UCI-HD, the peak AUROC favored the MLP (0.878 vs. 0.829). Post-hoc analyses confirmed that CNN performance is highly contingent on structural hyperparameters, particularly kernel size, rather than on robust feature learning. CNN performance on tabular data is thus heavily dependent on arbitrary feature ordering and structural design, posing a risk of stochastic degradation. Clinical AI applications using such data should prioritize stability over peak performance and account for the absence of inherent spatial structure in tabular inputs.
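The feature-order permutation protocol described above can be sketched in miniature. This is an illustrative toy, not the study's code: the data are synthetic, and a simple windowed scorer stands in for the 1D-CNN, chosen only because, like a convolution with kernel size `k`, it mixes adjacent columns and is therefore sensitive to feature ordering. The function names (`auroc`, `windowed_score`) and all parameters are assumptions for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a medical tabular dataset (not the study's data):
# 200 samples, 10 features; class 1 shifts the mean of the first 3 features.
n, d = 200, 10
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d))
X[y == 1, :3] += 1.0

def auroc(scores, labels):
    # Rank-based AUROC (equivalent to the normalized Mann-Whitney U statistic).
    m = labels.size
    order = np.argsort(scores)
    ranks = np.empty(m)
    ranks[order] = np.arange(1, m + 1)
    n1 = labels.sum()
    n0 = m - n1
    return (ranks[labels == 1].sum() - n1 * (n1 + 1) / 2) / (n0 * n1)

def windowed_score(Xp, labels, k=3):
    # Toy order-sensitive scorer standing in for a 1D-CNN with kernel size k:
    # it averages k adjacent columns, so its output depends on column order.
    windows = np.stack([Xp[:, i:i + k].mean(axis=1) for i in range(d - k + 1)])
    # Keep the window most correlated with the labels under this ordering.
    corrs = [abs(np.corrcoef(w, labels)[0, 1]) for w in windows]
    best = windows[int(np.argmax(corrs))]
    return best if np.corrcoef(best, labels)[0, 1] > 0 else -best

# 1,000 feature-order permutations: the study's protocol, in miniature.
aurocs = []
for _ in range(1000):
    perm = rng.permutation(d)
    aurocs.append(auroc(windowed_score(X[:, perm], y), y))

print(f"AUROC across permutations: mean={np.mean(aurocs):.3f}, "
      f"sd={np.std(aurocs):.3f}, min={np.min(aurocs):.3f}")
```

The dispersion of `aurocs` is the quantity of interest: an order-invariant model (such as an MLP on the same columns) would produce an identical score for every permutation, whereas any model that mixes adjacent columns spreads across permutations.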
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1038/s41598-026-39875-9.