Abstract
BACKGROUND: Inorganic arsenic (As) is a carcinogen, and >200 million people worldwide are exposed to As through drinking water. Long-term As exposure increases the risk of chronic diseases including cancer and cardiovascular disease. While prior work has identified several genomic regions where DNA methylation (DNAm) is associated with As exposure, our knowledge remains limited. METHODS: We first evaluated the association of As exposure (measured in urine and drinking water) with DNAm (measured genome-wide in blood cells at >720 000 cytosine-phosphate-guanine (CpG) sites) among 1186 Bangladeshi adults from the Health Effects of Arsenic Longitudinal Study, a population with a wide range of As exposure levels. We then used elastic net regression to build a DNAm-based biomarker of As exposure. RESULTS: We identified 1177 CpGs associated with urinary As (false discovery rate < 0.05) and further demonstrated that these associations are likely causal by using genetic determinants of As metabolism as instrumental variables. Arsenic-associated CpGs showed strong enrichment for genomic context and CpG sets related to disease states previously linked to As exposure. We developed a 255-CpG DNAm-based biomarker of As exposure that was highly predictive of urinary As levels (r² = 0.46) and of arsenical skin-lesion status, a common sign of As toxicity (area under the receiver-operating characteristic curve = 0.69). This biomarker was also strongly associated with measured urinary As levels in an independent Bangladeshi cohort (P = 1.3 × 10⁻¹⁹) and was further validated in the Strong Heart Study (P = 6.7 × 10⁻⁶), in which the DNAm-predicted As was also associated with overall mortality (P = 4.6 × 10⁻⁴). CONCLUSION: We demonstrate the utility of a DNAm-based biomarker of As exposure that can predict the risk of toxicity and present guidelines for developing epigenetic biomarkers to track environmental chemical exposures.
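For readers unfamiliar with the false discovery rate threshold used to select As-associated CpGs, the following is a minimal sketch of one common way to control it, the Benjamini-Hochberg step-up procedure. The abstract does not state which FDR method was applied, so the choice of Benjamini-Hochberg is an assumption here; the function name `benjamini_hochberg` and the toy P values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return one boolean per input P value: True if rejected at FDR level alpha.

    Assumption: Benjamini-Hochberg step-up procedure; the abstract does not
    specify which FDR method was used.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the hypotheses, ordered by ascending P value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank
    # Reject every hypothesis ranked at or below max_k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Toy P values, invented purely for illustration (not study data).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
significant = benjamini_hochberg(toy_p, alpha=0.05)
n_significant = sum(significant)
```

In an epigenome-wide setting the input would be one P value per CpG site (>720 000 here), and the CpGs flagged `True` would form the set reported as significant at false discovery rate < 0.05.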