Abstract
Nasopharyngeal carcinoma (NPC) is highly prevalent in southern China. Because its early symptoms are insidious, it is often diagnosed at an advanced stage, which complicates treatment and worsens prognosis. Accurate prediction of the NPC disease burden is essential for devising effective prevention and treatment strategies and for optimizing the allocation of medical resources. This study used disability-adjusted life years (DALYs) data on NPC in China from the 2021 Global Burden of Disease (GBD) study to develop 3 disease burden prediction models: autoregressive integrated moving average (ARIMA), deep neural network (DNN), and long short-term memory (LSTM). Model performance was assessed by mean absolute error (MAE), mean absolute percentage error (MAPE), and root mean square error (RMSE). The DNN model fit the training set best, with the lowest error metrics, but it overfit: its performance declined on the testing set. In contrast, the ARIMA model, which assumes stationarity, generalized best, with the lowest MAE and MAPE on the testing set. The LSTM model recorded higher errors on the testing set. Forecasts for 2022 to 2030 showed stable DALYs under ARIMA and a gradual decline under both DNN and LSTM. Taken together, the forecasts of the 3 models suggest that even if the disease burden of NPC gradually decreases, it will remain severe. In summary, although DNN excelled on the training data, ARIMA generalized best on the testing data, and LSTM struggled because of the limited data. Future research should integrate diverse data sources and improve model interpretability to support NPC prevention and treatment.
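For readers unfamiliar with the three error metrics named above, the following is a minimal sketch of their standard definitions. The function names and the sample values are illustrative assumptions; they are not the study's code and not GBD data.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Assumes no actual value is zero (true for DALY counts).
    """
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error: square root of the mean squared difference."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Made-up values for illustration only (not GBD data).
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(mae(actual, predicted))
print(mape(actual, predicted))
print(rmse(actual, predicted))
```

Because RMSE squares each error before averaging, it penalizes large individual misses more heavily than MAE, while MAPE expresses error relative to the size of the observed value. This is why a model can rank differently across the three metrics.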