Abstract
BACKGROUND: Accurate algorithms are needed to support real-world studies of medication safety in pregnancy. Kharbanda et al. developed and validated algorithms for congenital malformations that incorporate death data and diagnosis codes from the ICD-9-CM and ICD-10-CM coding eras.
OBJECTIVES: To modify an electronic health record (EHR)-based malformation algorithm for use with claims data and to quantify the impact of missing death information by calculating prevalence and sensitivity.
METHODS: Using the MarketScan Commercial Database (2007-2022), we established a linked cohort of birthing parents and their liveborn infants, and a claims subcohort restricted to years with inpatient death information. We also established a cohort of liveborn infants in Kaiser Permanente Washington (KPWA) integrated EHR/claims data (2007-2022) that included comprehensive death information. We applied the validated algorithm to identify 22 malformations in MarketScan and 7 in KPWA (those with a prevalence ≥ 10 per 10,000 live births in MarketScan). In MarketScan, we calculated malformation prevalence with and without death information. We assessed the contribution of death information to malformation identification by calculating the sensitivity of the algorithm without death information, using identification with death information as the gold standard.
RESULTS: Among 2,203,328 infants in the MarketScan cohort, malformation prevalence was 201.3 per 10,000 live births. In the MarketScan subcohort (n = 1,287,384), prevalence was 198.2 and 199.1 per 10,000 live births without and with death information, respectively. Among the most prevalent malformations, estimated sensitivity in the claims/EHR cohort ranged from 95.8% for severe cardiac defects to 100.0% for intestinal atresia or stenosis, pyloric stenosis, and limb deficiency; in the claims subcohort, it ranged from 98.6% for severe cardiac defects to 100.0% for intestinal atresia or stenosis and pyloric stenosis. Limitations include the use of an imperfect gold standard and a lack of chart review.
CONCLUSIONS: We adapted a validated malformation algorithm for use with claims data. Omitting death information did not meaningfully impact sensitivity, suggesting this algorithm can be applied to data sources lacking death information.
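The two measures reported above can be illustrated with a minimal sketch. It assumes prevalence is expressed per 10,000 live births and that sensitivity is the share of gold-standard cases (identified with death information) that are still identified without it. The function names and the counts in the example are hypothetical and are not study data.

```python
def prevalence_per_10k(n_cases: int, n_live_births: int) -> float:
    """Malformation prevalence per 10,000 live births."""
    if n_live_births <= 0:
        raise ValueError("n_live_births must be positive")
    return 10_000 * n_cases / n_live_births


def sensitivity(n_without_death: int, n_with_death: int) -> float:
    """Share of gold-standard cases (found using death information)
    that are also found when death information is omitted."""
    if n_with_death <= 0:
        raise ValueError("n_with_death must be positive")
    if not 0 <= n_without_death <= n_with_death:
        raise ValueError("cases without death info must be a subset of gold-standard cases")
    return n_without_death / n_with_death


# Hypothetical counts for illustration only (not taken from the study).
live_births = 1_000
cases_with_death = 20      # identified using diagnosis codes plus death information
cases_without_death = 19   # identified using diagnosis codes alone

prev_with = prevalence_per_10k(cases_with_death, live_births)
prev_without = prevalence_per_10k(cases_without_death, live_births)
sens = sensitivity(cases_without_death, cases_with_death)
```

In this toy example, one of the 20 gold-standard cases is identifiable only through death information, so omitting it lowers both the estimated prevalence and the sensitivity.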