Abstract
Non-coding variants discovered by genome-wide association studies (GWASs) are enriched in regulatory elements harboring transcription factor (TF) binding motifs, strongly suggesting a connection between disease association and the disruption of cis-regulatory sequences. Occupancy of a TF inside a region of open chromatin can be detected in ATAC-seq data, where a bound TF blocks the Tn5 transposase and leaves a pattern of relatively depleted Tn5 insertions known as a "footprint." Here, we sought to identify variants associated with TF binding, or "footprint quantitative trait loci" (fpQTLs), in ATAC-seq data generated from 170 human liver samples. We used computational tools to scan the ATAC-seq reads and quantify TF binding likelihood as "footprint scores" at variants derived from whole-genome sequencing of the same samples. We tested for association between genotype and footprint score and identified 809 fpQTLs, variants significantly associated with footprint-inferred TF binding (FDR < 5%). Because Tn5 insertion sites are measured at base-pair resolution, we show that fpQTLs can aid GWAS and QTL fine-mapping by pinpointing TF activity within broad trait-associated loci where the underlying causal variant is unknown. Liver fpQTLs were strongly enriched in ChIP-seq peaks, liver expression QTLs (eQTLs), and liver-related GWAS loci, and their inferred effect on TF binding was concordant with their effect on the underlying sequence motifs in 78% of cases. We conclude that fpQTLs can reveal causal GWAS variants, define the role of TF binding-site disruption in complex traits, and provide functional insights into non-coding variants, ultimately informing novel treatments for common diseases.
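The association step summarized above (genotype versus footprint score, followed by FDR control) can be illustrated with a minimal sketch. The abstract does not state which test or multiple-testing procedure was used, so everything here is an assumption made for illustration: genotypes are coded as alternate-allele dosages (0, 1, 2), the association statistic is the absolute Pearson correlation assessed by a permutation test, and FDR is controlled with the Benjamini-Hochberg procedure. The function names `pearson_r`, `permutation_pvalue`, and `benjamini_hochberg` are hypothetical and do not come from the study.

```python
import random
from statistics import mean


def pearson_r(x, y):
    """Pearson correlation; returns 0.0 if either input has no variance."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def permutation_pvalue(dosages, scores, n_perm=999, seed=0):
    """Two-sided permutation p-value for one variant (assumed test).

    dosages: alternate-allele counts (0, 1, 2), one per sample.
    scores:  footprint scores at the variant, one per sample.
    """
    rng = random.Random(seed)
    observed = abs(pearson_r(dosages, scores))
    shuffled = list(scores)
    hits = 0
    for _ in range(n_perm):
        # Shuffling scores breaks any genotype-score relationship.
        rng.shuffle(shuffled)
        if abs(pearson_r(dosages, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the p-value strictly above zero.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        qvalues[i] = running_min
    return qvalues
```

Under these assumptions, one would compute a p-value per variant with `permutation_pvalue`, pass the full list to `benjamini_hochberg`, and call variants with an adjusted value below 0.05 candidate fpQTLs. A permutation test is used here only because it needs no distributional assumptions and no third-party libraries; a regression with covariates would be a natural alternative.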