Abstract
Identifying the main environmental drivers of SARS-CoV-2 transmissibility in the population is crucial for understanding current and potential future outbursts of COVID-19 and other infectious diseases. To address this problem, we concentrate on the basic reproduction number R (0), which is not sensitive to testing coverage and represents transmissibility in an absence of social distancing and in a completely susceptible population. While many variables may potentially influence R (0), a high correlation between these variables may obscure the result interpretation. Consequently, we combine Principal Component Analysis with feature selection methods from several regression-based approaches to identify the main demographic and meteorological drivers behind R (0). We robustly obtain that country's wealth/development (GDP per capita or Human Development Index) is the most important R (0) predictor at the global level, probably being a good proxy for the overall contact frequency in a population. This main effect is modulated by built-up area per capita (crowdedness in indoor space), onset of infection (likely related to increased awareness of infection risks), net migration, unhealthy living lifestyle/conditions including pollution, seasonality, and possibly BCG vaccination prevalence. Also, we argue that several variables that significantly correlate with transmissibility do not directly influence R (0) or affect it differently than suggested by naïve analysis.