Using ensembles and distillation to optimize the deployment of deep learning models for the classification of electronic cancer pathology reports



Abstract

OBJECTIVE: We aim to reduce overfitting and model overconfidence by distilling the knowledge of an ensemble of deep learning models into a single model for the classification of cancer pathology reports.

MATERIALS AND METHODS: We consider a text classification problem involving 5 individual tasks. The baseline model is a multitask convolutional neural network (MtCNN), and the implemented ensemble (teacher) consists of 1000 MtCNNs. We performed knowledge transfer by training a single model (student) on soft labels derived by aggregating the ensemble's predictions. We evaluated performance in terms of accuracy and abstention rates, using softmax thresholding to abstain from low-confidence predictions.

RESULTS: The student model outperforms the baseline MtCNN in both abstention rate and accuracy, allowing the model to classify a larger volume of documents when deployed. The largest gains were observed for subsite and histology, for which the student model classified an additional 1.81% and 3.33% of reports, respectively.

DISCUSSION: Ensemble predictions provide a useful strategy for quantifying the uncertainty inherent in labeled data, thereby enabling the construction of soft labels with estimated probabilities over multiple classes for a given document. Training models with these soft labels reduces model confidence on difficult-to-classify documents, leading to fewer highly confident wrong predictions.

CONCLUSIONS: Ensemble model distillation is a simple tool for reducing model overconfidence in problems with extreme class imbalance and noisy datasets. These methods can facilitate the deployment of deep learning models in high-risk domains with limited computational resources, where minimizing inference time is required.
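To make the described pipeline concrete, the following is a minimal sketch (not the authors' implementation) of the two core steps: averaging the ensemble's softmax outputs into soft labels for distillation, and abstaining via softmax thresholding at inference. A PyTorch setup is assumed, and all names and values (the model objects, the 0.9 threshold) are illustrative placeholders rather than details from the paper.

# Illustrative sketch only: ensemble soft-label distillation and
# softmax-threshold abstention. The model objects and the 0.9 threshold
# are hypothetical placeholders, not values from the paper.
import torch
import torch.nn.functional as F

def ensemble_soft_labels(teacher_models, x):
    """Average the softmax outputs of an ensemble to form soft labels."""
    with torch.no_grad():
        probs = torch.stack([F.softmax(m(x), dim=-1) for m in teacher_models])
    return probs.mean(dim=0)  # shape: (batch, num_classes)

def distillation_step(student, optimizer, x, soft_labels):
    """One student update: cross-entropy against the ensemble's soft
    labels (equivalent to minimizing KL divergence up to a constant)."""
    optimizer.zero_grad()
    log_probs = F.log_softmax(student(x), dim=-1)
    loss = -(soft_labels * log_probs).sum(dim=-1).mean()
    loss.backward()
    optimizer.step()
    return loss.item()

def predict_with_abstention(model, x, threshold=0.9):
    """Predict a class, or return -1 (abstain) when the maximum softmax
    probability falls below the confidence threshold."""
    with torch.no_grad():
        probs = F.softmax(model(x), dim=-1)
    conf, pred = probs.max(dim=-1)
    pred = pred.masked_fill(conf < threshold, -1)  # abstain on low confidence
    return pred

A higher threshold trades coverage for accuracy: the model answers fewer documents but makes fewer highly confident mistakes, which is the deployment trade-off that the abstract's abstention rates measure.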
