Abstract
BACKGROUND: Podcasts have emerged as a popular medium in medical education over the past decade. Audio learning offers flexibility and may help residents engage with content in new ways. Reading scientific literature is a core skill for residents, yet many struggle to comprehend complex research articles. Advances in artificial intelligence (AI) have enabled the automatic generation of podcast-style summaries of documents. It remains unclear whether listening to AI-generated podcast summaries can match the educational value of reading the full text of medical papers, and whether this depends on the complexity of the article. OBJECTIVE: This study aims to compare comprehension of medical research papers when learning via an AI-generated audio podcast versus traditional reading, and to examine whether article complexity (narrative vs technical) moderates any difference. We hypothesize an interaction: for a highly complex article, residents who read the full text should achieve better understanding than those who listen to a summary, whereas for an easier article, the difference between modalities should be smaller. METHODS: We designed a 2×2 mixed factorial study enrolling 60 resident physicians preparing for board certification in internal medicine or cardiology. All participants will engage with 2 peer-reviewed cardiology articles differing in complexity: a narrative case report on eosinophilic myocarditis and a technical research article on quantifying the vena contracta area using 3-dimensional echocardiography. Each participant will read 1 article and listen to an AI-generated podcast summary of the other, with article assignment and order counterbalanced to control for order effects. The podcasts are created using Google NotebookLM's experimental audio overview feature. Participants will complete a multiple-choice knowledge test for each article. The primary outcomes are comprehension scores for each modality.
The secondary outcomes include intrinsic motivation, perceived learning gains, and cognitive load for each condition. Data will be analyzed using a mixed ANOVA to test the main effects of modality and article complexity, as well as their interaction. RESULTS: Data collection is expected to be completed by early 2026. We will report the trial results according to the CONSORT (Consolidated Standards of Reporting Trials) guidelines, and any deviations from this protocol will be documented and justified. No results were available at the time of publication of this protocol. CONCLUSIONS: This randomized trial will offer evidence on the effectiveness of AI-generated podcast summaries as a learning tool for medical literature. If listening to an AI-generated podcast yields comprehension comparable or superior to reading the full article, it could validate an innovative, time-saving approach for busy medical trainees. Conversely, if significant deficits are observed in the podcast condition (especially for complex content), the findings will highlight the limitations of AI summaries and the continued importance of traditional reading for thorough understanding. INTERNATIONAL REGISTERED REPORT IDENTIFIER (IRRID): PRR1-10.2196/78505.