Abstract
All metazoa, including both vertebrates and invertebrates, are susceptible to cancer. Cancer prevalence among vertebrate species ranges from less than 1% to more than 50% and depends on tissue type. Species that are more or less susceptible to cancer overall may nonetheless exhibit a low or high rate of a specific tumor type. Understanding the genetic or lifestyle factors that affect the relative rate of tumorigenesis in different tissues across species would support new approaches to personalized and preventative therapy. Research in this area is challenging because of heterogeneous reporting criteria, including inconsistent and species-specific tumor metadata. Here we present a harmonized dataset of tumors from 41,539 individuals across 825 species. Each tumor is assigned the NCBI taxonomy ID of the species in which it was found; a tissue-type ID specifying the body site, if applicable; and a cell-type ID. Using this dataset, we demonstrate that the relative incidence of breast, colon, and lung cancers among nonhuman species is typically low, while the relative incidence of soft tissue tumors is high. These results suggest the existence of unrecognized genetic machinery involved in tumor suppression or oncogenesis that may be a key driver of nonhuman cancer and likely plays a role in human disease progression, including metastasis.