Abstract
OBJECTIVE: We discuss implications of potential ascertainment biases for studies examining diabetes risk following SARS-CoV-2 infection using electronic health records (EHRs). We quantitatively explore sensitivity of results to misclassification of COVID-19 status using data from the U.S.-based Diabetes in Children, Adolescents and Young Adults (DiCAYA) Network on children (≤17 years) and young adults (18-44 years).

MATERIALS AND METHODS: In our retrospective case study from the DiCAYA Network, SARS-CoV-2 was identified using labs and diagnoses from June 1, 2020 to December 31, 2021. Patients were followed through December 31, 2022 for new diabetes diagnoses. Sites examined incident diabetes by COVID-19 status using Cox proportional hazards models. Results were pooled in meta-analyses. A bias analysis examined potential impact of COVID-19 misclassification scenarios on results, guided by hypotheses that sensitivity would be <50% and would be higher among those who developed diabetes.

RESULTS: Prevalence of documented COVID-19 was low overall and variable across sites (children: 4.4%-7.7%, young adults: 6.2%-22.7%). Individuals with documented COVID-19 were at higher risk of incident diabetes compared to those with no documented infection, but results were heterogeneous across sites. Findings were highly sensitive to COVID-19 misclassification assumptions. Observed results could be biased away from the null under several differential misclassification scenarios.

DISCUSSION: Although EHR-based documentation of COVID-19 was associated with incident diabetes, COVID-19 phenotypes likely had low sensitivity, with considerable variation across sites. Misclassification assumptions strongly impacted interpretation of results.

CONCLUSION: Given the potential for low phenotype sensitivity and misclassification, caution is warranted when interpreting analyses of COVID-19 and incident diabetes using clinical or administrative databases.
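
The kind of bias analysis described in the methods can be illustrated with a minimal sketch. It is not the authors' procedure: the study used Cox models and pooled hazard ratios, whereas this example back-corrects a simple 2x2 table and reports a risk ratio. The function names are hypothetical, and the counts are invented for illustration only; they are not DiCAYA data. The sensitivity values follow the stated hypotheses (below 50%, and higher among those who developed diabetes), and perfect specificity is assumed.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio from a 2x2 table.

    a: exposed cases, b: unexposed cases,
    c: exposed non-cases, d: unexposed non-cases.
    """
    return (a / (a + c)) / (b / (b + d))


def true_exposed(observed_exposed, group_total, sensitivity, specificity=1.0):
    """Back-calculate the number truly exposed within one outcome group,
    given the sensitivity and specificity of exposure classification."""
    denom = sensitivity + specificity - 1.0
    if denom <= 0:
        raise ValueError("sensitivity + specificity must exceed 1")
    return (observed_exposed - (1.0 - specificity) * group_total) / denom


def corrected_risk_ratio(a, b, c, d, se_cases, se_noncases,
                         sp_cases=1.0, sp_noncases=1.0):
    """Risk ratio after correcting exposure misclassification.

    Sensitivity may differ between cases and non-cases
    (differential misclassification).
    """
    cases, noncases = a + b, c + d
    a_true = true_exposed(a, cases, se_cases, sp_cases)
    c_true = true_exposed(c, noncases, se_noncases, sp_noncases)
    b_true, d_true = cases - a_true, noncases - c_true
    if min(a_true, b_true, c_true, d_true) < 0:
        raise ValueError("scenario is incompatible with the observed counts")
    return risk_ratio(a_true, b_true, c_true, d_true)


# Illustrative counts only (assumption, not study data):
# 100,000 patients, 1,000 with documented COVID-19, 300 incident diabetes cases.
a, b, c, d = 6, 294, 994, 98706

observed = risk_ratio(a, b, c, d)

# Differential scenario: documentation is more complete among those who
# later developed diabetes (45% vs 30% sensitivity).
differential = corrected_risk_ratio(a, b, c, d, se_cases=0.45, se_noncases=0.30)

# Non-differential scenario: 40% sensitivity in both groups.
nondifferential = corrected_risk_ratio(a, b, c, d, se_cases=0.40, se_noncases=0.40)

print(f"observed RR:                  {observed:.2f}")
print(f"corrected, differential:      {differential:.2f}")
print(f"corrected, non-differential:  {nondifferential:.2f}")
```

In this toy scenario, the corrected risk ratio under differential misclassification is smaller than the observed one, meaning the observed estimate would be biased away from the null. With equal sensitivity in both groups the corrected estimate is slightly larger than the observed one. This matches the abstract's point that interpretation depends heavily on the misclassification assumptions.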