Abstract
Crohn's disease (CD) is a chronic autoinflammatory disease of the gastrointestinal tract. Anatomical labels such as ileal Crohn's disease (ICD) and colonic Crohn's disease (CCD) do not capture the molecular heterogeneity of the disease, and this gap contributes to trial-and-error therapy. Patients who switch biologics under this trial-and-error approach incur higher annual costs. We analyzed bulk RNA-seq data from 2,353 biopsies across two independent datasets (GSE193677 and GSE57945) using a standardized pipeline. Principal component analysis showed clear molecular separation between ICD and CCD samples. Differential expression modeling (DESeq2, FDR ≤ 0.05) identified the top 300 differentially expressed genes (DEGs) across subtype-specific signatures. Pathway analysis confirmed known subtype biology, with ICD driven by autophagy-related processes and CCD by immune activation pathways. Subtype-specific protein-protein interaction (PPI) networks diverged sharply, with CKB driving barrier-related processes in ICD and SPP1 coordinating immune activation in CCD. Known CD susceptibility genes (e.g., NOD2, ATG16L1, IL23R) were recovered within leading-edge sets, supporting construct validity. Proteomic validation using the ProteomeXchange dataset PXD012284 confirmed concordant enrichment of ICD-associated autophagy and lysosomal modules and CCD-associated innate immune pathways at the protein level. Single-cell transcriptomic validation further localized leading-edge genes to epithelial lineages in ICD and to myeloid and glial populations in CCD, supporting the cellular specificity of the subtype programs. Together, these results indicate that ICD and CCD are biologically distinct at the transcriptome, proteome, and network levels. The prioritized hubs and pathways nominate tractable, subtype-specific hypotheses for prospective validation and provide a framework for precision therapeutics in CD.