Abstract
This study examines the application of WildfireGPT to wildfire forecasting, focusing on its limitations in quantitatively predicting Fire Radiative Power (FRP) spread and comparing its performance with a specialized predictive model based on TabNet. While WildfireGPT is widely accessible and convenient for wildfire-related discussions, it lacks the specialized training, real-time data integration, and algorithmic precision required for reliable wildfire forecasting. To highlight these shortcomings, we conducted an experiment using real-world NASA FRP datasets. Our TabNet-based model, trained on variables such as Vapor Pressure Deficit (VPD), temperature (T), pressure (P), and Fire Weather Index (FWI), showed high correlation between predicted and observed FRP values, with low Mean Absolute Error (MAE) and Mean Squared Error (MSE). In contrast, chatbots built on retrieval-augmented generation (RAG) and large language models (LLMs), such as WildfireGPT, performed unreliably on quantitative FRP forecasting when given the same input data as prompts. These findings underscore the risks of over-reliance on general-purpose AI tools like WildfireGPT for quantitative modeling tasks in wildfire management. This study advocates informed use of AI tools and emphasizes the need for domain-specific models for accurate and actionable wildfire forecasting.
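For readers unfamiliar with the evaluation metrics named above, the following is a minimal sketch of how MAE, MSE, and the Pearson correlation between observed and predicted FRP values can be computed. The function names and the sample values are illustrative assumptions, not the study's code or data, and the abstract does not state which correlation coefficient was used; Pearson is assumed here.

```python
import math

def mean_absolute_error(observed, predicted):
    """Average absolute difference between observed and predicted values."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)

def mean_squared_error(observed, predicted):
    """Average squared difference between observed and predicted values."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)

def pearson_r(observed, predicted):
    """Pearson correlation coefficient (assumed choice of correlation measure)."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_o * var_p)

# Hypothetical FRP values, for illustration only (not from the NASA datasets).
observed_frp = [10.0, 20.0, 30.0]
predicted_frp = [12.0, 18.0, 33.0]

mae = mean_absolute_error(observed_frp, predicted_frp)
mse = mean_squared_error(observed_frp, predicted_frp)
r = pearson_r(observed_frp, predicted_frp)
```

Under these definitions, a forecasting model performs well when MAE and MSE are low and the correlation is close to 1; the same functions can be applied to numeric answers parsed from a chatbot's responses to compare both approaches on equal terms.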