Abstract
SARS-CoV-2 infection of humans and its spillover into new species illustrate the need for effective predictive tools that identify the host ranges of emerging zoonotic viruses. We hypothesized that host receptor sequence similarity across animals could be leveraged to identify species at risk for cross-species virus transmission. We developed a flexible computational pipeline, Multi-reference Similarity Analysis of Receptor Sequences (MrSARS), to compare homologous virus-binding receptor sequences to multiple reference host sequences. MrSARS assigns an aggregate similarity score to each examined receptor sequence and categorizes each sequence as highly susceptible, moderately susceptible, or resistant to infection. We used the sarbecovirus receptor angiotensin-converting enzyme 2 (ACE2) as a model to test MrSARS predictions because ACE2 has been extensively sequenced and its interactions with SARS-CoV and SARS-CoV-2 glycoproteins have been characterized. We analyzed 825 vertebrate ACE2 sequences and determined that primates and even-toed ungulates ranked highest among susceptible species. We tested these predictions by transiently expressing ACE2 from confirmed susceptible, putatively susceptible, and resistant species in 293T cells and infecting the cells with VSV reporter viruses pseudotyped with SARS-CoV-2 variant and SARS-CoV-related glycoproteins. The results of these experiments correlated with MrSARS susceptibility predictions and the existing literature. Our study illustrates that receptor sequence information from multiple susceptible species can be used to identify potential host ranges of circulating viruses.
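The general idea of scoring a receptor sequence against several reference hosts and binning the result can be illustrated with a minimal sketch. This is not the MrSARS implementation: the similarity measure (percent identity over pre-aligned sequences), the aggregation (an unweighted mean across references), the category thresholds, and all function names below are assumptions chosen only for illustration.

```python
from statistics import mean


def percent_identity(a: str, b: str) -> float:
    """Fraction of identical residues between two pre-aligned sequences.

    Assumption: both sequences come from the same alignment, so they have
    equal length and use '-' for gaps. Gapped columns are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    compared = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not compared:
        return 0.0
    return sum(x == y for x, y in compared) / len(compared)


def aggregate_score(query: str, references: list[str]) -> float:
    """Unweighted mean identity of the query against every reference host.

    Assumption: a plain mean; the actual pipeline may weight or combine
    per-reference scores differently.
    """
    return mean(percent_identity(query, ref) for ref in references)


def categorize(score: float, high: float = 0.9, moderate: float = 0.7) -> str:
    """Bin an aggregate score into three categories.

    Assumption: the 0.9 and 0.7 cutoffs are hypothetical placeholders.
    """
    if score >= high:
        return "highly susceptible"
    if score >= moderate:
        return "moderately susceptible"
    return "resistant"


if __name__ == "__main__":
    # Toy 10-residue fragments, not real ACE2 sequences.
    references = ["ACDEFGHIKL", "ACDEFGHIKV"]
    for name, seq in [("query_1", "ACDEFGHIKL"), ("query_2", "ACDEYYYIKL")]:
        score = aggregate_score(seq, references)
        print(name, round(score, 2), categorize(score))
```

Using several reference hosts rather than one means that a query is not penalized for differing from a single reference at positions where known susceptible hosts themselves vary.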