Abstract
BACKGROUND: Pancreatic ductal adenocarcinoma (PDAC) is rarely diagnosed early and has a poor prognosis. An interpretable, noninvasive model for early diagnosis is therefore needed in clinical practice.
AIM: To develop an interpretable, noninvasive early diagnostic model for PDAC using plasma extracellular vesicle long RNA (EvlRNA).
METHODS: The diagnostic model was constructed from plasma EvlRNA data. A new feature, the EvlRNA-index, was introduced during model construction, and four algorithms were used to calculate it. The performance of the resulting model was then evaluated. A series of bioinformatics methods was used to explore the potential mechanism underlying the EvlRNA-index as the input feature of the model, and the relationship between key features and PDAC was explored at the single-cell level.
RESULTS: A novel interpretable machine learning framework with a two-layer classifier was developed based on plasma EvlRNA, using a newly proposed concept, the EvlRNA-index, as the input feature. In the first layer, PDACandCPvsHealth-Probabilistic PCA Index-SVM (PDAC and chronic pancreatitis vs health-probabilistic principal component analysis index-support vector machine) (1-18) achieved an accuracy of 91.51%, a Matthews correlation coefficient of 0.7760, and an area under the curve of 0.9560. In the second layer, PDACvsCP-Probabilistic PCA Index-RF (PDAC vs chronic pancreatitis-probabilistic principal component analysis index-random forest) (2-17) achieved an accuracy of 93.83%, a Matthews correlation coefficient of 0.8422, and an area under the curve of 0.9698. Forty-nine PDAC-related genes were identified, of which 16 were already known, suggesting that the remaining 33 may also be PDAC-related.
CONCLUSION: An interpretable two-layer machine learning framework was proposed for early diagnosis and prediction of PDAC based on plasma EvlRNA, providing new insights into the clinical value of EvlRNA.
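The two-layer decision flow and the Matthews correlation coefficient reported above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scalar "index" input, and the threshold stand-ins for the trained SVM and random forest are all hypothetical and serve only to show how a sample would be routed through the two layers.

```python
import math
from typing import Callable


def matthews_corrcoef(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0.0 when any marginal total is zero.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def two_layer_predict(
    sample: float,
    layer1_is_diseased: Callable[[float], bool],
    layer2_is_pdac: Callable[[float], bool],
) -> str:
    """Route one sample through a two-layer cascade.

    Layer 1 separates (PDAC or chronic pancreatitis) from healthy.
    Layer 2 is only consulted for samples flagged by layer 1 and
    separates PDAC from chronic pancreatitis (CP).
    """
    if not layer1_is_diseased(sample):
        return "healthy"
    return "PDAC" if layer2_is_pdac(sample) else "CP"


# Hypothetical stand-ins for the trained classifiers: simple thresholds
# on a made-up scalar index (the real models are an SVM and a random forest).
def toy_layer1(index: float) -> bool:
    return index > 0.3


def toy_layer2(index: float) -> bool:
    return index > 0.7


if __name__ == "__main__":
    for value in (0.1, 0.5, 0.9):
        print(value, two_layer_predict(value, toy_layer1, toy_layer2))
    print(matthews_corrcoef(tp=45, tn=45, fp=5, fn=5))
```

Because the second layer only sees samples that the first layer flags, errors made in the first layer propagate, which is one reason each layer is evaluated separately with its own accuracy, Matthews correlation coefficient, and area under the curve.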