Abstract
PURPOSE: This study aims to enhance the capabilities of large language models (LLMs) in anesthesiology decision support by leveraging a graph-based Retrieval-Augmented Generation (RAG) framework to improve analytical reasoning and deliver evidence-driven results.

METHODS: Drawing on clinical guidelines and textbooks, we constructed an Anesthesiology Knowledge Graph using UMLS as the reference ontology. We present a graph-based RAG framework, AnesGraph-RAG, which integrates pre-retrieval judgment, hybrid retrieval, and tailored Chain-of-Thought prompting to enhance reasoning performance. Its variant, AnesGraph-RAG-UD, further incorporates unfamiliarity-driven graph retrieval and structural querying to improve token efficiency.

RESULTS: The constructed anesthesiology knowledge graph comprises 212,313 entities and 529,845 relations. Evaluation on an expert-annotated dataset confirms high entity recall (82.19%) and precision (82.13%). Evaluations on the Anesthesiology Attending Physician Qualification Examination dataset show that our framework outperforms ChatGPT-3.5-turbo, ChatGPT-4o, and DeepSeek-V3, with relative improvements in average accuracy of 2.4%, 7.3%, and 3.7%, respectively. The performance gains are particularly pronounced in domains assessing professional knowledge and professional practical skills. In addition, a case study demonstrates that the enhanced post-hoc explanation module produces reasoning paths supported by more detailed evidence.

CONCLUSION: By integrating structured domain knowledge with adaptive reasoning mechanisms, our graph-based RAG framework improves both the accuracy and interpretability of LLM responses in anesthesiology tasks. These results demonstrate the practical value of combining domain-specific knowledge graphs with LLMs for clinical decision support.