Abstract
RATIONALE & OBJECTIVES: Accurate ascertainment of end-stage kidney disease (ESKD) in electronic health record (EHR) data is important for many epidemiological studies. This study developed and validated an algorithm using diagnosis and procedure codes to identify patients with ESKD (treated with maintenance dialysis or kidney transplantation) in EHR data. STUDY DESIGN: Study of diagnostic algorithms. SETTING & PARTICIPANTS: The development cohort included 559,615 patients treated at the Geisinger Health System (January 1996 to June 2018). The validation cohort included 767,186 patients treated at New York University Langone Health System (January 2018 to December 2020). ALGORITHMS COMPARED: The algorithm, based on diagnosis and procedure codes, was compared with a nominal gold standard: ESKD designation in the United States Renal Data System (USRDS) data. The performance of the algorithm was characterized by sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV). The dates of incident ESKD assigned by the algorithm and by the USRDS were compared in a subset of cases. OUTCOME: ESKD (maintenance dialysis, prior recipient of a kidney transplant, or kidney transplantation surgery) cases. RESULTS: In Geisinger, we developed an ESKD algorithm that identified 4,766 (0.85%) ESKD cases; there were 5,155 (0.92%) ESKD cases reported by the USRDS. The sensitivity, specificity, PPV, and NPV of the algorithm were 73.9% (95% CI, 72.7-75.1%), 99.83% (99.82-99.84%), 79.9% (78.9-81.0%), and 99.76% (99.75-99.77%), respectively. When the algorithm was applied to New York University Langone Health System data, the sensitivity, specificity, PPV, and NPV were 71.8% (95% CI, 70.7-73.0%), 99.95% (99.95-99.96%), 91.6% (90.8-92.4%), and 99.79% (99.78-99.80%), respectively. The median difference between dates of incident ESKD (algorithm minus USRDS) was -3 (IQR, -21 to 83) days for Geisinger and 0 (IQR, -12 to 69) days for New York University Langone Health.
LIMITATIONS: Only structured EHR data were used. CONCLUSIONS: Algorithms combining diagnosis and procedure codes show high specificity and modest sensitivity for identifying patients with ESKD, providing a research tool to inform future EHR-based studies. PLAIN-LANGUAGE SUMMARY: Although electronic health record (EHR) data hold great promise for advancing kidney research, little work has been done to accurately identify ESKD cases in these data. This study developed and validated an algorithm using diagnosis and procedure codes to identify ESKD in EHRs. Our findings showed that the algorithm performed consistently in 2 different health systems, demonstrating high specificity and negative predictive value but lower sensitivity and positive predictive value. This algorithm may inform future ESKD research using EHR data.
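WORKED EXAMPLE: As a rough illustration of how the four performance measures relate to a 2x2 table, the Python sketch below recomputes them for the Geisinger cohort. The cohort size (559,615), the algorithm-positive count (4,766), and the USRDS-positive count (5,155) come from the abstract. The true-positive count of 3,809 is not reported; it is an assumption back-calculated from the rounded sensitivity and PPV, so the remaining cell counts are approximate. The function and variable names are illustrative only.

```python
def diagnostic_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Return sensitivity, specificity, PPV, and NPV from 2x2 table counts."""
    return {
        "sensitivity": tp / (tp + fn),  # USRDS cases that the algorithm flags
        "specificity": tn / (tn + fp),  # non-cases that the algorithm leaves unflagged
        "ppv": tp / (tp + fp),          # flagged patients who are USRDS cases
        "npv": tn / (tn + fn),          # unflagged patients who are not USRDS cases
    }


# Counts reported in the abstract (Geisinger development cohort).
n_total = 559_615
algorithm_positive = 4_766
usrds_positive = 5_155

# ASSUMPTION: not reported; back-calculated from the rounded
# sensitivity (73.9%) and PPV (79.9%), so it may be off by a few patients.
tp = 3_809

fp = algorithm_positive - tp  # flagged by the algorithm but not in USRDS
fn = usrds_positive - tp      # in USRDS but missed by the algorithm
tn = n_total - tp - fp - fn   # neither flagged nor in USRDS

metrics = diagnostic_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

Under this assumed true-positive count, the recomputed values agree with the reported point estimates after rounding. The example also shows why specificity and NPV are so high: with ESKD prevalence below 1%, true negatives dominate both denominators.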