Abstract
BACKGROUND: Attention deficit hyperactivity disorder (ADHD) affects 5-7.2% of children and 2.5% of adults. Despite its prevalence, ADHD remains underdiagnosed and undertreated, which creates significant challenges for affected individuals. Early diagnosis and intervention can prevent adverse outcomes and improve quality of life.

METHODS: We developed a predictive model to identify adults with ADHD from electronic health records (EHRs). The dataset comprised 2,973 adult patients (aged 18 years and above) diagnosed with ADHD and a control group of 4,447 adults who were referred to psychologists but received no ADHD diagnosis. We implemented a transformer-based architecture that used only clinical codes and gender as input features. We used five-fold cross-validation and evaluated model performance on a held-out test set of 800 patients, 400 of whom had an ADHD diagnosis.

RESULTS: Adult ADHD could be predicted from clinical data: a 6-month model achieved an area under the receiver operating characteristic curve (AUC) of 0.79 (95% confidence interval: 0.76-0.81), an F1-score of 0.79, a sensitivity of 0.80, and a specificity of 0.77. SHapley Additive exPlanations (SHAP) identified key contributing codes, including F158 and Y903, consistent with known associations between ADHD and substance use.

CONCLUSIONS: Our findings show that machine learning can use clinical codes and demographic data from routine EHRs to support early, cost-efficient diagnosis of adult ADHD, paving the way for earlier intervention and improved outcomes.
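
The reported test-set metrics can be cross-checked against one another with a short calculation. The sketch below is illustrative only: it assumes the sensitivity and specificity apply to the balanced held-out set (400 ADHD, 400 controls) and treats the rounded values as exact, so the confusion-matrix counts are approximations rather than figures taken from the study. The function names are hypothetical.

```python
def confusion_from_rates(n_pos, n_neg, sensitivity, specificity):
    """Approximate confusion-matrix counts from rounded rates.

    Assumption: the published rates are treated as exact, so the
    returned counts are only an approximation of the true ones.
    """
    tp = round(sensitivity * n_pos)  # ADHD patients correctly flagged
    tn = round(specificity * n_neg)  # controls correctly cleared
    return {"tp": tp, "fn": n_pos - tp, "tn": tn, "fp": n_neg - tn}


def f1_from_counts(counts):
    """F1 is the harmonic mean of precision and recall (sensitivity)."""
    precision = counts["tp"] / (counts["tp"] + counts["fp"])
    recall = counts["tp"] / (counts["tp"] + counts["fn"])
    return 2 * precision * recall / (precision + recall)


# Values taken from the abstract: 400 cases, 400 controls,
# sensitivity 0.80, specificity 0.77.
counts = confusion_from_rates(400, 400, 0.80, 0.77)
f1 = f1_from_counts(counts)
print(counts, round(f1, 2))
```

Under these assumptions the implied F1-score rounds to 0.79, which agrees with the value reported above.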