Abstract
BACKGROUND: Tracking the evolution of scientific knowledge is challenging because of the scale and complexity of the biomedical literature. Neurosyphilis is a clinically complex and historically stigmatized condition that remains difficult to diagnose and manage. Its underexplored literature offers an ideal test case for evaluating digital methods that map research trends and identify knowledge gaps. We aimed to assess how large language models (LLMs), network analysis, and interrupted time series analysis (ITSA) can be combined to automate literature classification and to examine how knowledge of neurosyphilis has evolved.
METHODS: We systematically searched Web of Science, Embase, PubMed Central, the Cochrane Library, and Lens for records on neurosyphilis published up to December 31, 2024. We included records with available titles and abstracts that GPT-4o mini identified as focusing primarily on syphilis or neurosyphilis. Eligible records were classified into 23 research fields using LLM-based prompts. Network analysis was used to visualize changes in research structures over time, and ITSA was used to assess associations between publication trends and major clinical or technological milestones.
RESULTS: Of the 14 934 records retrieved, 4 646 met the inclusion criteria. LLM-based classification showed high repeatability (agreement = 99·67%, 95% CI 99·47–99·80; Cohen’s κ = 0·99, 95% CI 0·96–1·00). Biomedical, Clinical, and Health sciences were the most common domains. Network analysis revealed a shift from dense, discipline-specific clusters to larger interdisciplinary structures. ITSA revealed significant increases in publication activity following the introduction of penicillin G, the emergence of HIV, genome sequencing of Treponema pallidum, and the rise of digital dissemination platforms.
CONCLUSIONS: Combining LLMs with bibliometric and network methods provides a scalable framework for analyzing large-scale biomedical literature. When applied to neurosyphilis, the approach revealed links between research activity and clinical and technological advances. Beyond this case study, the method could support meta-research and inform evidence-based decision-making for other complex medical conditions.
SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1186/s12874-025-02750-8.
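The repeatability figures reported in the results (percent agreement and Cohen's κ) can be illustrated with a minimal sketch. It assumes repeatability is measured by running the same classification prompt twice on the same records and comparing the two label lists; the abstract does not describe the authors' exact procedure. The function names `percent_agreement` and `cohens_kappa` and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def percent_agreement(run_a, run_b):
    """Share of records that received the same label in both runs."""
    if len(run_a) != len(run_b) or not run_a:
        raise ValueError("runs must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(run_a, run_b))
    return matches / len(run_a)


def cohens_kappa(run_a, run_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(run_a)
    p_observed = percent_agreement(run_a, run_b)
    counts_a = Counter(run_a)
    counts_b = Counter(run_b)
    # Chance agreement: probability that both runs pick the same label
    # if each run assigned labels independently at its own marginal rates.
    p_expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / n**2
    if p_expected == 1.0:
        # Both runs used a single identical label; kappa is undefined,
        # so this sketch returns 1.0 by convention (an assumption).
        return 1.0
    return (p_observed - p_expected) / (1 - p_expected)


# Toy labels for six records (illustrative only, not study data).
run_a = ["clinical", "clinical", "biomedical", "health", "biomedical", "clinical"]
run_b = ["clinical", "clinical", "biomedical", "health", "clinical", "clinical"]

print(f"agreement = {percent_agreement(run_a, run_b):.4f}")
print(f"kappa     = {cohens_kappa(run_a, run_b):.4f}")
```

In this toy case the two runs agree on five of six records, so agreement is 5/6, while κ is lower (5/7) because some of that agreement would be expected by chance given how often each label is used. Confidence intervals such as those reported in the abstract would need an additional step, for example bootstrapping over records.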