Abstract
Most complex trait association signals reside in the noncoding genome, where defining function is challenging. Massively parallel reporter assays (MPRAs) offer a scalable means to test the regulatory impact of variants but are typically cell-type agnostic, pairing cloned fragments with generic "housekeeping" promoters. To explore the context sensitivity of MPRAs, we screened a panel of nearly 12,000 fragments across >300 diabetes- and metabolic-trait-associated regions in a pancreatic β cell line model. We compared activity when fragments were placed upstream versus downstream of a reporter gene and when they were combined with the synthetic housekeeping promoter super core promoter 1 (SCP1) versus the physiologically relevant human insulin (INS) gene promoter. We identified clear effects of MPRA construct design on regulatory activity. A subset of fragments (n = 702/11,656) displayed positional bias, evenly distributed across upstream and downstream preferences. Promoter choice also influenced MPRA activity (n = 698/11,656), with most affected fragments (73.4%) favoring the cell-specific INS promoter. A screen for sequence annotations associated with INS promoter preference revealed enrichment for HNF1 binding motifs. HNF1 family transcription factors are key regulators of glucose metabolism and are disrupted in maturity-onset diabetes of the young (MODY), suggesting genetic convergence between rare coding variants that cause MODY and common type 2 diabetes (T2D)-associated regulatory regions. A follow-up HNF1-focused MPRA highlighted several instances where motif deletion or mutation disrupted regulatory activity specifically in the context of the INS promoter and in the β cell model but not in another diabetes-relevant cell type, skeletal muscle. These results identify technical factors that may require careful consideration when designing MPRA experiments.