Abstract
BACKGROUND: Single-cell RNA sequencing (scRNA-seq) enables an in-depth understanding of cellular transcriptome heterogeneity and dynamics. However, a key challenge in scRNA-seq analysis is dropout events, in which expressed transcripts go undetected. Dropouts seriously affect the accuracy and reliability of downstream analysis, so an effective imputation method that accurately recovers the missing values is needed to mitigate their adverse effects. METHODS: We propose a bidirectional autoencoder-based model (BiAEImpute) for dropout imputation in scRNA-seq datasets. During training, the model uses row-wise and column-wise autoencoders to learn cellular and gene-level features, respectively. These learned features are then integrated to impute the missing values, improving the robustness and accuracy of imputation. RESULTS: Evaluations on four real scRNA-seq datasets consistently indicate that BiAEImpute outperforms existing imputation methods. BiAEImpute restores missing values, facilitates the clustering of cell subpopulations, refines the identification of marker genes, and aids the inference of cell developmental trajectories. CONCLUSION: BiAEImpute is effective and robust for imputing missing data in scRNA-seq, contributing to more accurate downstream analyses. The source code of BiAEImpute is available at https://github.com/LiuXinyuan6/BiAEImpute.
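
To make the bidirectional idea in METHODS concrete, the following is a minimal NumPy sketch, not the BiAEImpute implementation. The abstract does not specify the network architecture, the loss, or how the two sets of learned features are integrated, so everything here is an assumption made for illustration: a one-hidden-layer autoencoder, a squared-error loss restricted to nonzero entries, and a plain average of the two reconstructions. The function names `train_autoencoder` and `bidirectional_impute` are hypothetical. The input is assumed to be a cells x genes matrix of log-normalized expression in which zeros mark candidate dropouts.

```python
import numpy as np


def train_autoencoder(X, hidden=8, epochs=200, lr=0.01, seed=0):
    """Fit a one-hidden-layer autoencoder on the rows of X and return its reconstruction.

    Assumption: the loss is squared error on nonzero (observed) entries only,
    so zeros are treated as possibly missing rather than as training targets.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    mask = (X > 0).astype(float)
    W1 = rng.normal(0.0, 0.1, size=(d, hidden))   # encoder weights
    W2 = rng.normal(0.0, 0.1, size=(hidden, d))   # decoder weights
    for _ in range(epochs):
        H = np.tanh(X @ W1)                        # hidden representation
        R = H @ W2                                 # reconstruction
        E = (R - X) * mask                         # error on observed entries only
        grad_W2 = H.T @ E / n
        grad_pre = (E @ W2.T) * (1.0 - H ** 2)     # backprop through tanh
        grad_W1 = X.T @ grad_pre / n
        W1 -= lr * grad_W1
        W2 -= lr * grad_W2
    return np.tanh(X @ W1) @ W2


def bidirectional_impute(X, **kwargs):
    """Impute zeros in a cells x genes matrix using two autoencoders.

    The row-wise model treats each cell as a sample; the column-wise model
    treats each gene as a sample (it is trained on the transpose).
    Assumption: the two reconstructions are combined by simple averaging.
    """
    row_recon = train_autoencoder(X, **kwargs)        # cell-level view
    col_recon = train_autoencoder(X.T, **kwargs).T    # gene-level view
    combined = np.clip(0.5 * (row_recon + col_recon), 0.0, None)
    # Keep observed values; fill only the zero entries.
    return np.where(X > 0, X, combined)
```

Calling `bidirectional_impute(X)` returns a matrix of the same shape in which nonzero entries are left unchanged and zero entries are replaced by the non-negative averaged reconstruction. Training only on observed entries is one common design choice for imputation; the released source code should be consulted for what BiAEImpute actually does.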