Abstract
Linking mothers to their newborns in health records is crucial for understanding the impact of policies, programs, and medical treatments on inter-generational health outcomes. While previous studies have used shared identifiers like names or addresses for linkage, such data are often unavailable in Medicaid records due to privacy concerns. We present a scalable framework and linking algorithm that connects mothers and infants using Medicaid MAX and TAF claims data, which lack direct identifiers, while preserving privacy. Our method accommodates variations in Medicaid records over time and across states and supports matches at different levels of stringency. Using data from all 50 states over 19 years, our algorithm linked 11.68 million mother-infant dyads, covering 68% of Medicaid-enrolled infants and over 30% of all U.S. infants. We provide our code to facilitate research on social determinants of health and the inter-generational effects of U.S. public policy.