Abstract
OBJECTIVE: Ulcerative colitis (UC) is a chronic inflammatory bowel disease for which remission depends on corticosteroid (CS) treatment. Because disease pathophysiology is diverse, treatment must be selected optimally for each case. This study aimed to identify prognostic factors for refractory UC using a machine learning model based on nationwide registry data. METHODS: Of 79,096 UC cases newly registered in a nationwide registry from April 2003 to March 2012 (before the widespread use of biologic agents in Japan), the study included 4003 patients with 3-year data who had a Mayo score of ≥3 at the time of registration and had been using CS since their entry into the registry. A pointwise linear (PWL) model was used for machine learning. RESULTS: The PWL model, developed to predict long-term remission (lasting >3 years), achieved an area under the curve (AUC), precision, recall, and F-value of 0.774, 0.55, 0.70, and 0.62, respectively, on the test dataset covering the period from registration to 2 years later. Furthermore, the presence of pseudopolyps at the time of registration was significantly and negatively correlated with remission, highlighting its importance as a prognostic factor. CONCLUSIONS: In this study, we constructed a highly accurate prognosis prediction model for UC, a disease in which inflammation persists for an extended period, by training a machine learning model on long-term disease progression. The results showed that machine learning can be used to determine the factors affecting remission during the treatment of refractory UC.
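As a consistency check on the reported metrics, the short sketch below recomputes the F-value from the stated precision and recall. It assumes the F-value is the standard F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly. The function name `f1_score_from` is hypothetical and used only for illustration.

```python
def f1_score_from(precision: float, recall: float) -> float:
    """Return the F1 score, i.e. the harmonic mean of precision and recall.

    Assumption: the abstract's "F-value" is the balanced F1 score.
    """
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values reported in the abstract for the test dataset.
reported_precision = 0.55
reported_recall = 0.70

f_value = f1_score_from(reported_precision, reported_recall)
print(f"F1 = {f_value:.3f}")  # compare with the reported F-value of 0.62
```

Under this assumption the recomputed value rounds to the reported 0.62, so the three figures are mutually consistent.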