Abstract
Single-cell genomics gives us a new perspective for understanding multivariate phenotypic and genetic effects at the cellular level. Recent technologies can measure multiple modalities of individual cells, such as transcriptomes, epigenomes, metabolomes, and spatial profiles. However, integrating multimodal single-cell data to identify cell-to-cell correspondences remains a challenging task. We emphasize the importance of integrating data at a biologically relevant level of granularity. It is also crucial to account for the inherent discrepancies between modalities in order to balance biological discovery against noise removal. In this article, we systematically review the most popular single-cell integration methods and models, covering cell label transfer, data visualization, and clustering tasks for downstream analysis. We further evaluate more than 10 popular integration methods on paired and unpaired gold-standard datasets. Finally, we discuss the data preferences, limitations, applications, challenges, and future directions of these methods.