Abstract
PURPOSE: This study adapts DeepInsight, a method originally designed to convert non-image data into image form, to cancer subtype classification from high-dimensional gene expression data. METHODS: We compared DeepInsight with support vector machines, LightGBM, neural networks, and decision trees, each with and without the Synthetic Minority Oversampling Technique (SMOTE), using gene expression data from breast, lung, and colon cancers. We introduced a novel multi-class feature selection method that uses modified aggregated class activation maps to identify key genes across cancer subtypes. These critical genes were further analyzed with Gene Ontology to explore their roles in significant biological processes. RESULTS: DeepInsight consistently achieved higher F1 scores than the traditional models on the breast, lung, and colon cancer datasets, and it handled the multi-class classification setting effectively. Notably, several top genes were identified as significant by multiple methods. The Gene Ontology analysis covered both the top genes identified by DeepInsight and the common genes recognized by multiple methods. CONCLUSION: The adapted DeepInsight classifies cancer subtypes by transforming high-dimensional gene expression data into image representations. Using aggregated class activation maps, it identifies critical pixels within these images, which enables the discovery of distinct genes that other methods may not highlight. DeepInsight shows potential as a valuable tool for classifying cancer subtypes and identifying critical genes.
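The abstract does not spell out how aggregated class activation maps are turned into a gene ranking. As a rough illustration only, the sketch below assumes that each subtype has one 2-D activation map, that each pixel maps back to zero or more genes, and that aggregation is a plain sum over classes. The function `rank_genes`, the `pixel_to_genes` lookup, the placeholder gene names, and the toy activation values are all hypothetical and are not taken from the study.

```python
from collections import defaultdict


def rank_genes(class_maps, pixel_to_genes, top_k=3):
    """Rank genes by activation summed over classes (an assumed aggregation rule)."""
    scores = defaultdict(float)
    for cam in class_maps.values():  # one 2-D activation map per subtype
        for r, row in enumerate(cam):
            for c, activation in enumerate(row):
                # Every gene mapped to this pixel inherits the pixel's activation.
                for gene in pixel_to_genes.get((r, c), []):
                    scores[gene] += activation
    # Highest total activation first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [gene for gene, _ in ranked[:top_k]]


# Toy 2x2 activation maps for two hypothetical subtypes.
class_maps = {
    "subtype_A": [[0.9, 0.1], [0.0, 0.2]],
    "subtype_B": [[0.3, 0.8], [0.0, 0.1]],
}
# Hypothetical pixel-to-gene lookup; several genes may share one pixel.
pixel_to_genes = {
    (0, 0): ["GENE1"],
    (0, 1): ["GENE2", "GENE3"],
    (1, 1): ["GENE4"],
}

top_genes = rank_genes(class_maps, pixel_to_genes, top_k=2)
print(top_genes)
```

Summing over classes favors genes that are active for several subtypes; a per-class maximum would instead favor subtype-specific genes. Which rule the modified method uses is not stated in the abstract, so the choice here is purely illustrative.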