Abstract
Slide-based lectures remain the primary means by which undergraduate students learn the mathematical, physical, and systems-level foundations of medical imaging. Despite this central educational role, no openly available dataset pairs imaging lecture slides with clean, well-aligned explanatory narration suitable for scientific and educational research. The authors introduce MEDI-SLATE (medical imaging slide-lecture aligned teaching ensemble), a dataset constructed from a complete undergraduate biomedical engineering course on medical imaging. The dataset contains 1117 high-resolution slides, each paired with refined narration derived from classroom audio through automatic speech recognition followed by careful manual cleanup. MEDI-SLATE covers linear systems, Fourier analysis, signal processing, X-ray physics, computed tomography, positron emission tomography/single photon emission computed tomography, magnetic resonance imaging, ultrasound, and optical imaging. In addition to the slide-text pairs, the dataset includes lecture-level difficulty tags, key ideas, common student misunderstandings, and practice questions sourced directly from the instructor's materials. A fully reproducible preprocessing pipeline is provided, covering slide extraction, narration refinement, alignment, and corpus-level analyses. MEDI-SLATE offers a high-fidelity, openly available resource for medical imaging education, curriculum development, multimodal learning research, and the creation of artificial intelligence-assisted instructional tools, with all data and code released for transparent use and future extension.