Abstract
BACKGROUND: As polypharmacy and the use of over-the-counter (OTC) drugs and herbal supplements become increasingly prevalent, the potential for adverse drug-drug interactions (DDIs) poses significant challenges to patient safety and health care outcomes. OBJECTIVE: This study evaluates the capacity of Generative Pre-trained Transformer (GPT) models to accurately assess DDIs between prescription drugs (Rx) and OTC medications or herbal supplements. METHODS: Using a popular subscription-based tool (Lexicomp) as the reference standard, we compared the risk ratings these models assigned to 43 Rx-OTC and 30 Rx-herbal supplement pairs against the corresponding Lexicomp ratings. RESULTS: All models underperformed, with accuracies below 50% and poor agreement with the Lexicomp ratings as measured by Cohen's kappa. GPT-4 and GPT-4o showed a modest improvement over GPT-3.5 in identifying higher-risk interactions. CONCLUSION: These results highlight the challenges and limitations of using off-the-shelf large language models for guidance in DDI assessment.
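
To make the two reported metrics concrete, the following is a minimal sketch of how accuracy and Cohen's kappa could be computed when a model's risk ratings are compared against reference ratings. It is not the study's analysis code. The function names, the single-letter rating labels, and the six example pairs are all hypothetical and chosen purely for illustration; they are not data from this study.

```python
from collections import Counter


def accuracy(reference, predicted):
    """Fraction of drug pairs where the model rating equals the reference rating."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("rating lists must be non-empty and of equal length")
    return sum(r == p for r, p in zip(reference, predicted)) / len(reference)


def cohens_kappa(reference, predicted):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(reference)
    p_observed = accuracy(reference, predicted)
    ref_counts = Counter(reference)
    pred_counts = Counter(predicted)
    # Chance agreement: probability that both raters pick the same category
    # if each rated independently according to its own marginal frequencies.
    p_expected = sum(ref_counts[c] * pred_counts[c] for c in ref_counts) / (n * n)
    if p_expected == 1.0:
        # Both raters used one identical category throughout; kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical ratings for six drug pairs (illustrative only, not study data).
reference_ratings = ["C", "D", "X", "B", "C", "D"]
model_ratings = ["C", "C", "X", "B", "B", "C"]

print(f"accuracy = {accuracy(reference_ratings, model_ratings):.3f}")
print(f"kappa    = {cohens_kappa(reference_ratings, model_ratings):.3f}")
```

Because kappa discounts the agreement expected by chance, a model can match the reference on half of the pairs and still show only weak agreement, which is why the two metrics are reported together. This sketch treats the rating categories as unordered; a weighted kappa would be one alternative if the ordering of risk levels were taken into account.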