Abstract
Gridded population datasets represent high-resolution distributions of human occupancy, enabling informed decision-making across a broad range of fields. These data products are valuable for assessing environmental risk, urban development, disaster preparedness and resource allocation-areas where accurate population estimates directly enhance policy effectiveness and optimize resource distribution. Despite the importance of gridded population datasets, traditional population modeling approaches often overlook inherent uncertainties in the estimation process. This limitation can create a false sense of certainty in population estimates, potentially leading to flawed decisions by those who rely on the data. To address this methodological gap, we introduce a probabilistic machine learning modeling framework, LandScan Mosaic, that explicitly incorporates uncertainty into the population modeling process. Our approach systematically quantifies uncertainty in three key modeling parameters of the LandScan HD gridded population dataset: building use types, floor counts, and occupancy rates. By employing Monte Carlo simulations, we propagate these uncertainties through the modeling process, yielding probability distributions of population counts in place of deterministic point estimates. We demonstrate the practical application of this framework in Iloilo City, Philippines, using structured decision-making techniques and our probabilistic estimates to identify and prioritize areas most affected by projected flooding, supporting targeted interventions that address both economic and social risks. In doing so, we propose a population-specific approach for incorporating confidence into structured decision making processes. Through a comparative analysis with conventional deterministic approaches and point estimate approaches, including LandScan HD and WorldPop, we evaluate how the incorporation of machine learning and uncertainty influences decision rankings. This research advances population distribution modeling by offering a robust, quantitative approach that explicitly accounts for uncertainty in the underlying data, along with guidance for how users can apply uncertainty in their decision-making.