Abstract
BACKGROUND: Evidence suggests a nonlinear relationship between long-term fine particulate matter (PM2.5) exposure and mortality, and methods for flexibly incorporating this nonlinearity can be improved. To heuristically evaluate the necessity of incorporating machine-learning algorithms, we compared the estimated mortality benefit of reducing long-term PM2.5 across three analytical methods of varying flexibility and complexity.

METHODS: Using a cohort of Canadian Community Health Survey respondents (followed from 2005 until 2014), we obtained consenting respondents' baseline characteristics, time-varying annual average PM2.5 over the previous 3 years, yearly income, neighborhood characteristics, and vital status. We estimated the 10-year cumulative mortality rate under both the natural course of exposure and a hypothetical dynamic intervention that would set a respondent's exposure to 8.8 μg/m³ (the current Canadian annual PM2.5 standard) whenever it was higher. We compared the three methods' estimates and their mean squared errors under a range of hypothetical true values.

RESULTS: Among 62,365 participants, the 10-year cumulative mortality rate differences per 1000 participants were -0.23 (95% confidence interval: -0.46, 0.00) for parametric g-computation, -0.83 (-1.24, -0.43) for the targeted minimum loss-based estimator (TMLE) using parametric models, and -0.67 (-1.27, -0.06) for TMLE with SuperLearner and six highly flexible candidate algorithms. Changing the hyperparameters did not meaningfully change the estimates or algorithm weights.

CONCLUSIONS: All three methods indicated that reducing long-term exposure to PM2.5 would yield tangible public health benefits in Canada, where PM2.5 levels are among the lowest worldwide. However, the advantage of employing machine-learning algorithms with a doubly robust estimator appeared minimal, especially considering the bias-variance tradeoff.