Abstract
Tracking cells in chimeric models is essential to developmental biology, regenerative medicine, and transplantation research, yet it remains challenging. Current methods such as fluorescent labeling and genetic barcoding are technically demanding, costly, and often impractical for dynamic tissues. We present CellSexID, a computational framework that uses sex as a surrogate marker for inferring cell origin. Machine-learning models trained on single-cell transcriptomic data accurately predict the sex of individual cells, enabling in silico distinction between donor and recipient cells in sex-mismatched settings. Ensemble feature selection identifies minimal sex-linked gene sets, and the framework has been validated on public datasets and by experimental flow sorting, confirming its biological relevance. We demonstrate CellSexID's applicability beyond chimeric models, including organ transplantation and sample demultiplexing. As a practical alternative to physical labeling, CellSexID facilitates precise cell tracking and supports diverse biomedical applications in which mixed cellular populations must be distinguished.
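
To make the core idea concrete, the following minimal sketch illustrates sex prediction from a handful of sex-linked transcripts, followed by donor/recipient assignment in a sex-mismatched setting. It is not the CellSexID implementation: the marker panel (Xist, which is expressed in female cells, and four Y-linked genes), the synthetic expression values, the dropout rate, and the plain logistic regression are all assumptions chosen for illustration, and the helper names (`simulate_cell`, `train`, `infer_origin`) are hypothetical. The ensemble feature selection described above is not reproduced here.

```python
import math
import random

# Assumed marker panel (not taken from the paper): Xist is female-expressed,
# the remaining genes are Y-linked and therefore male-specific.
GENES = ["Xist", "Ddx3y", "Eif2s3y", "Kdm5d", "Uty"]
FEMALE, MALE = 0, 1


def simulate_cell(sex, rng, dropout=0.2):
    """Return synthetic log-expression values for one cell."""
    values = []
    for gene in GENES:
        expressed = (gene == "Xist") == (sex == FEMALE)
        # Dropout: an expressed transcript is sometimes not detected.
        detected = expressed and rng.random() >= dropout
        mean = 2.0 if detected else 0.1
        values.append(max(0.0, rng.gauss(mean, 0.3)))
    return values


def simulate(n_cells, seed):
    """Return (expression matrix, sex labels) for n_cells synthetic cells."""
    rng = random.Random(seed)
    sexes = [rng.randint(0, 1) for _ in range(n_cells)]
    return [simulate_cell(s, rng) for s in sexes], sexes


def prob_male(weights, bias, x):
    """Logistic model: probability that a cell is male."""
    z = bias + sum(w * v for w, v in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, lr=0.5, epochs=300):
    """Fit logistic regression by full-batch gradient descent."""
    weights, bias = [0.0] * len(GENES), 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(GENES), 0.0
        for x, label in zip(X, y):
            err = prob_male(weights, bias, x) - label
            grad_b += err
            for j, v in enumerate(x):
                grad_w[j] += err * v
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


def infer_origin(weights, bias, x, donor_sex):
    """Map predicted sex to cell origin, given the known donor sex."""
    predicted = MALE if prob_male(weights, bias, x) >= 0.5 else FEMALE
    return "donor" if predicted == donor_sex else "recipient"


X_train, y_train = simulate(400, seed=0)
X_test, y_test = simulate(200, seed=1)
weights, bias = train(X_train, y_train)

predictions = [MALE if prob_male(weights, bias, x) >= 0.5 else FEMALE
               for x in X_test]
accuracy = sum(p == t for p, t in zip(predictions, y_test)) / len(y_test)
print(f"held-out accuracy on synthetic cells: {accuracy:.3f}")
```

In a male-donor, female-recipient experiment, `infer_origin(weights, bias, cell, donor_sex=MALE)` would label each cell as donor or recipient. Real single-cell data are sparser and noisier than this simulation, which is why a learned multi-gene classifier is preferable to thresholding a single marker.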
