Abstract
PURPOSE: Reviewing pathology reports requires physicians to integrate complex histopathologic, immunohistochemical, and molecular findings from multiple reports and institutions, often under time constraints that increase the risk of error and fatigue. Large language models (LLMs) offer a potential solution by generating concise, coherent summaries from complex pathology data.
METHODS: Patients who underwent initial consultation in a thoracic clinic between January 2019 and July 2023 were included. Original pathology reports and the corresponding physician pathology summaries from consultation notes were extracted and anonymized. Six open-source LLMs (Llama 3.0, Llama 3.1, Llama 3.2, Mistral, Gemma, and DeepSeek-R1) generated pathology summaries directly from the original reports. Objective and subjective evaluations were performed using the original reports as the ground truth. LLM-generated summaries were compared with physician summaries for correctness, completeness, and conciseness. Additional subjective assessments with multiple evaluators were conducted for Llama 3.1.
RESULTS: Ninety-four cases met the eligibility criteria. With the original pathology reports as the ground truth, the LLM-generated summaries scored higher than the physician pathology summaries on all objective evaluation metrics (P < .0001). In the subjective evaluation, DeepSeek-R1, Mistral, Llama 3.1, and Llama 3.2 received higher ratings for completeness (P = .017, P < .0001, P < .0001, and P < .0001, respectively) while maintaining correctness comparable to that of the physician pathology summaries (P = 1.000). The results remained consistent in the additional subjective analyses involving multiple evaluators for Llama 3.1.
CONCLUSION: LLM-generated summaries performed better than physician summaries on objective metrics and were rated more complete in subjective evaluations. These results highlight the potential of LLMs as valuable tools for enhancing clinical documentation and workflow efficiency in oncology practice.