Abstract
Under starvation, Myxococcus xanthus bacteria initiate a multicellular developmental program in which cells move to form fruiting bodies and differentiate into distinct cell types. Many genes affecting this process have been identified, and perturbing genes within the same pathway is assumed to induce similar phenotypic changes, although these changes may be subtle or obscured by pleiotropy. However, these pathways cannot be systematically mapped because there are no reliable methods for quantifying phenotypic similarity. Here, we applied deep learning to quantify the phenotypic patterns and self-organization dynamics of 292 genetically distinct strains. We integrated ResNet and StyleGAN2 into a variational autoencoder and trained it together with a Siamese network that learns phenotypic similarity. This end-to-end system encoded high-resolution microscopy images into 13-dimensional feature vectors, effectively capturing variation in aggregation patterns across time and strains. Human evaluation confirmed that our model's reconstructions were visually indistinguishable from real images and closely aligned with input phenotypes. Importantly, the feature space is interpretable: individual dimensions correlate with biological features such as aggregate number and size, and extrapolation along these dimensions produces predictable morphological changes. Remarkably, our model revealed that developmental phenotypes and ultimate aggregation fate are predictable from the earliest images, before visible aggregation begins. This predictability held across both genetic and environmental sources of variation, suggesting that subtle, early-stage phenotypic signatures carry critical information about developmental trajectories. These results demonstrate how machine learning can reveal hidden aspects of complex multicellular dynamics and provide methods for phenotypic analysis without manual annotation.
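To make the joint training idea concrete, the following is a minimal sketch of how a variational autoencoder objective might be combined with a Siamese similarity term on 13-dimensional latent vectors. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the loss functions, so the contrastive loss, the `margin`, and the weights `beta` and `gamma` are hypothetical choices, and the reconstruction error is passed in as a plain number rather than computed by a ResNet encoder and StyleGAN2 decoder.

```python
import math
import random

LATENT_DIM = 13  # feature-vector size stated in the abstract


def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I): the usual VAE regularizer."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, 1) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def euclidean(a, b):
    """Euclidean distance between two latent vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def contrastive_loss(z_a, z_b, same_phenotype, margin=1.0):
    """Assumed Siamese term: pull similar pairs together and push
    dissimilar pairs at least `margin` apart (hypothetical choice)."""
    d = euclidean(z_a, z_b)
    if same_phenotype:
        return d * d
    return max(0.0, margin - d) ** 2


def joint_loss(recon_error, mu, log_var, z_a, z_b, same_phenotype,
               beta=1.0, gamma=1.0):
    """Reconstruction + beta * KL + gamma * similarity.
    `beta` and `gamma` are hypothetical weights, not values from the paper."""
    return (recon_error
            + beta * kl_to_standard_normal(mu, log_var)
            + gamma * contrastive_loss(z_a, z_b, same_phenotype))


# Usage: two images with a similar phenotype, encoded to mean and log-variance vectors.
rng = random.Random(0)
mu_a, log_var_a = [0.0] * LATENT_DIM, [0.0] * LATENT_DIM
mu_b, log_var_b = [0.1] * LATENT_DIM, [0.0] * LATENT_DIM
z_a = reparameterize(mu_a, log_var_a, rng)
z_b = reparameterize(mu_b, log_var_b, rng)
loss = joint_loss(0.5, mu_a, log_var_a, z_a, z_b, same_phenotype=True)
```

In this sketch the similarity term acts directly on the sampled latent vectors, so gradients from the Siamese objective would shape the same 13-dimensional space that the decoder reconstructs from; whether the original system couples the two networks this way is an assumption.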