Error Reduction in Leukemia Machine Learning Classification With Conformal Prediction

Abstract

PURPOSE: Recent advances in machine learning have produced classifiers that predict molecular subtypes of acute lymphoblastic leukemia (ALL) from RNA-sequencing (RNA-seq) data. Although these models show promising results, they often lack robust performance guarantees. This study had three aims: to quantify the uncertainty of these classifiers, to provide prediction sets that control the false-negative rate (FNR), and to reduce errors implicitly by converting incorrect predictions into uncertain ones.

METHODS: Conformal prediction (CP) is a distribution-agnostic framework for generating statistically calibrated prediction sets whose size reflects model uncertainty. We applied an extension of CP, conformal risk control, to three RNA-seq ALL subtype classifiers. Using RNA-seq data from 1,227 patient samples taken at diagnosis, we developed a multiclass conformal predictor, ALLCoP, which generates prediction sets with a statistically guaranteed bound on the FNR.

RESULTS: ALLCoP produced prediction sets at specified FNR tolerances ranging from 7.5% to 30%. In a validation cohort, ALLCoP reduced the FNR of the ALLIUM RNA-seq ALL subtype classifier from 8.95% to 3.5%. Among patients whose subtype was not previously known, ALLCoP reduced the proportion of empty predictions from 37% to 17%. Notably, up to 34% of the prediction sets containing multiple classes included the PAX5alt subtype, suggesting that larger prediction sets may reflect secondary aberrations and biological complexity that contribute to classifier uncertainty. Finally, ALLCoP was validated on two additional RNA-seq ALL subtype classifiers, ALLSorts and ALLCatchR.

CONCLUSION: Our results highlight the potential of CP both to enhance the use of oncologic RNA-seq subtyping classifiers and to uncover additional molecular aberrations of potential clinical importance.
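To make the idea of FNR-controlled prediction sets concrete, the following is a minimal, generic sketch of conformal risk control for a single-label multiclass classifier. It is not the ALLCoP implementation, whose details the abstract does not give. The function names (`calibrate_lambda`, `prediction_sets`), the threshold grid, and the synthetic data are all assumptions made for illustration. The sketch assumes a held-out calibration set of class probabilities with known labels, and it uses the fact that, when each sample has exactly one true subtype, the false-negative loss of a prediction set is simply whether the true subtype is missing from it.

```python
import numpy as np


def calibrate_lambda(cal_probs, cal_labels, alpha, grid_size=1001):
    """Pick the smallest threshold parameter lam whose adjusted
    calibration risk is at most alpha (conformal risk control with
    a loss bounded by 1). Hypothetical helper, not from the paper."""
    n = len(cal_labels)
    # Probability the model assigned to each sample's true class.
    true_p = cal_probs[np.arange(n), cal_labels]
    for lam in np.linspace(0.0, 1.0, grid_size):
        # A sample is a false negative if its true class falls
        # below the inclusion threshold 1 - lam.
        risk = np.mean(true_p < 1.0 - lam)
        # Finite-sample correction: (n / (n + 1)) * risk + 1 / (n + 1).
        if (n / (n + 1)) * risk + 1.0 / (n + 1) <= alpha:
            return float(lam)
    # No grid point qualifies (tiny calibration set): include every class.
    return 1.0


def prediction_sets(probs, lam):
    """Boolean matrix: entry (i, k) is True if class k is in sample i's set."""
    return probs >= 1.0 - lam


# Synthetic stand-in for classifier outputs (assumption: 5 subtypes).
rng = np.random.default_rng(0)
n, k = 500, 5
labels = rng.integers(0, k, size=n)
logits = rng.normal(size=(n, k))
logits[np.arange(n), labels] += 2.0  # make the true class more likely
probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

alpha = 0.10  # target FNR tolerance
lam = calibrate_lambda(probs, labels, alpha)
sets = prediction_sets(probs, lam)
miss_rate = 1.0 - sets[np.arange(n), labels].mean()
set_sizes = sets.sum(axis=1)
```

Under this scheme a confident prediction yields a set with one subtype, an ambiguous one yields several, and a sample whose probabilities all fall below the threshold yields an empty set. Lowering `alpha` raises `lam`, which enlarges the sets in exchange for fewer missed true subtypes.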
