Abstract
OBJECTIVES: To develop and test a novel machine learning approach for monitoring the impact of computerized clinical decision support (CDS) tools on clinicians' electronic health record (EHR) activities. MATERIALS AND METHODS: Our CDS monitoring approach leverages topic modeling, a latent-variable statistical machine learning method, to infer health providers' EHR activities from EHR audit logs. We applied this approach to monitor the impact of a tobacco cessation support CDS tool newly implemented in 5 cancer clinics (2018-2021). We trained the topic model on EHR audit log data from 3445 encounters (pre-CDS-implementation: 1734, post-CDS-implementation: 1711) for patients with active smoking status. The number of topics was determined automatically based on within-topic coherence and across-topic divergence, and 4 domain experts assigned clinically relevant EHR activity labels to the identified topics. RESULTS: The topic model identified 2 distinct activities focusing on CDS (act on CDS, bypass/postpone CDS), 2 activities related to CDS (review patient records and address alerts, use note templates and acknowledge the completion of CDS), 6 related to accessing (access patient station) and reviewing patient data (external records, synopsis data, snapshot of patient data, problem list/diagnosis/notes, treatment plan), and 4 related to modifying the EHR (modify diagnosis/problem lists, document visit with record review, perform administrative activities for visit and billing, and document follow-up care plan). Comparing matched 1-hour after-check-in windows post-implementation (n = 841) versus pre-implementation (n = 841) of CDS, the mean prevalence of providers' EHR-use activities (expressed as a proportion out of 1.0) increased for CDS-focused activities (0.073; 95% CI, 0.066 to 0.079) and CDS-related activities (0.098; 95% CI, 0.089 to 0.106) and decreased for modifying the EHR (-0.113; 95% CI, -0.124 to -0.102) and reviewing patient data (-0.058; 95% CI, -0.072 to -0.044).
DISCUSSION: Our topic model-based CDS monitoring approach can identify shifts in the prevalence of EHR-use activities between the pre-implementation and post-implementation periods. This approach can be applied to detect unintended changes in EHR activities at a large population scale following CDS implementation, providing valuable insights to guide focused qualitative investigations for CDS improvement or de-implementation. CONCLUSION: Our approach offers a scalable, data-driven framework for evaluating the real-world impact of EHR-embedded CDS tools. Built on a generic machine learning framework, this approach could be adapted to explore the impact of other healthcare quality improvement strategies that use EHR-integrated CDS interventions.
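
The pre- versus post-implementation comparison reported in RESULTS can be illustrated with a minimal sketch. It assumes that each matched pair of windows yields one prevalence value per activity group (a proportion out of 1.0) and that the interval is a normal-approximation 95% CI on the mean paired difference. The abstract does not state how the intervals were computed, so the function `paired_mean_difference`, its `z` parameter, and the toy values below are hypothetical and do not reproduce the study's data or method.

```python
import math
import statistics


def paired_mean_difference(pre, post, z=1.96):
    """Mean of (post - pre) over matched windows with a normal-approximation CI.

    pre, post: equal-length sequences of topic prevalence values, one per
    matched window, for a single activity group.
    Returns (mean_difference, ci_lower, ci_upper).
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must contain the same number of windows")
    if len(pre) < 2:
        raise ValueError("at least two matched pairs are required")
    # Difference within each matched pair of windows.
    diffs = [after - before for before, after in zip(pre, post)]
    mean_diff = statistics.fmean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff - z * std_err, mean_diff + z * std_err


# Toy prevalence values for one activity group (illustrative only).
pre_windows = [0.02, 0.00, 0.05, 0.01, 0.03]
post_windows = [0.10, 0.08, 0.09, 0.12, 0.07]
mean_diff, lower, upper = paired_mean_difference(pre_windows, post_windows)
print(f"mean change = {mean_diff:.3f} (95% CI, {lower:.3f} to {upper:.3f})")
```

A positive mean change with an interval that excludes 0 would correspond to an increase in that activity group after implementation, and a negative one to a decrease.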