Abstract
MOTIVATION: Patients with intermediate-stage hepatocellular carcinoma (HCC) require repeated disease monitoring, prognosis assessment, and treatment planning. A novel machine learning model, the survival path mapping (SP) model, has been developed, but its performance relative to conventional machine learning (ML) models remains unknown. Time-series data of 2644 patients with intermediate-stage HCC from four medical centers in China, collected between January 2007 and December 2018, were reviewed and included. Static ML models based on Gaussian naive Bayes (GNB), support vector machine (SVM), and random forest (RF) were built to predict survival from data at initial admission. Longitudinal data divided into time slices were used to construct the SP model. The time-dependent c-index was compared between models. RESULTS: The training set, internal testing set, and external testing set comprised 1560, 670, and 414 HCC patients, respectively. From the 12th month after initial diagnosis onward, the SP model showed superior or non-inferior prognostic performance compared with the GNB and RF models in the training set and the external testing set. In the external testing cohort, the SP model had a higher time-dependent c-index than all conventional ML models from the 6th month onward. In conclusion, the SP model showed superior performance in long-term dynamic prognosis prediction compared with conventional static ML models for intermediate-stage HCC. AVAILABILITY AND IMPLEMENTATION: The model parameters are provided in the manuscript.
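As an illustration of the evaluation metric named above, the following is a minimal sketch of a concordance index for right-censored survival data. It implements Harrell's c-index, which is a simpler relative of the time-dependent c-index and is not necessarily the exact definition used in the study. The function name `harrell_c_index` and the toy data are hypothetical and chosen only for illustration; pairs with tied follow-up times are skipped as a simplification.

```python
from itertools import combinations


def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data (illustrative sketch).

    times:       observed follow-up time per patient
    events:      1 if death was observed, 0 if the patient was censored
    risk_scores: model output; a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Order the pair so that patient a has the shorter follow-up time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # A pair is comparable only if the times differ and the shorter
        # follow-up ended in an observed event (simplification: ties skipped).
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0   # higher risk assigned to the earlier death
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5   # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: four patients, one censored at month 15.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
print(harrell_c_index(toy_times, toy_events, toy_risk))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking; comparing such values across models at successive time points is the general idea behind the model comparison described in the abstract.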