Abstract
Carbon monoxide dehydrogenases containing nickel-iron active sites ([NiFe]-CODHs) catalyze the reversible oxidation of CO to CO₂ and are key targets for biocatalytic CO₂ reduction. Despite dramatic differences in catalytic rates and O₂ tolerance among CODH variants, the molecular basis for this functional diversity remains poorly understood. We applied comparative genomics and synteny analysis to 1376 CODH and 1545 hybrid cluster protein (HCP) sequences to investigate the biochemical roles of CODH clades A–F. Around 30% of genomes encode multiple CODH isoforms. The analysis revealed distinct gene clustering patterns that correlate with biochemical function. Clades A, E, and F exhibit a degree of distributional exclusivity, whereas clades C and D frequently co-occur with active CODHs, suggesting auxiliary roles. Operon architecture analysis revealed functional specialization: clade A is linked to acetyl-CoA synthase; clades A, E, and F contain essential maturation machinery (CooC, CooJ, CooT), correlating with catalytic activity; clade B associates with transporters; clade C with electron transfer partners; and clade D with transcriptional regulators. High CODH-HCP co-occurrence (except for clade A) suggests functional or environmental interdependency. These findings establish clades A, E, and F as primary biocatalyst targets while defining regulatory functions for clades C and D, providing a genomics framework for predicting CODH phenotypes.