Abstract
Evidence syntheses are valuable sources of robust and transparent knowledge that can identify gaps in research and inform evidence-based decision making. However, the synthesis process is time-consuming and costly. We investigated a new AI-based method that uses a large language model (LLM) grounded in ontologies (i.e., structured, machine-interpretable glossaries of domain terminology) to extract information from a set of 80 articles on coastal wetland restoration outcomes. We evaluated this method by comparing human-extracted data with data extracted by OntoGPT, a Python package that combines an LLM with ontologies to extract structured information. We found that OntoGPT achieved 65% average agreement with human reviewers, although agreement varied with the type of information requested for extraction. Agreement was highest when extracting standardized information and lower for study-specific, interpretation-heavy information. Precision and recall, two common measures of artificial intelligence performance, were 58% and 57%, respectively. Our results demonstrate the potential for LLMs to save some labour in the evidence synthesis process but reveal core challenges (e.g., complex information, subjective judgments) where further development is needed. While LLMs cannot replace human reviewers, they have the potential to assist with data extraction.

Supplementary Information: The online version contains supplementary material available at 10.1186/s13750-026-00381-0.
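For readers unfamiliar with these metrics, precision and recall follow their standard definitions in terms of true positives (TP), false positives (FP), and false negatives (FN); what counts as a "match" between a model extraction and a human extraction depends on the study's own scoring protocol, which is not restated here:

\[
\text{Precision} = \frac{TP}{TP + FP}, \qquad
\text{Recall} = \frac{TP}{TP + FN}
\]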