Abstract
Contour delineation is crucial to both the efficacy and the side-effect profile of radiotherapy (RT), but it inevitably involves inter-observer variability (IOV). Deep learning (DL) models have been used to assist in contour delineation, but further evaluation is needed to guide healthcare professionals in applying them judiciously. The contours of 22 anatomical structures and the gross tumor volume (GTV) for 30 patients with nasopharyngeal carcinoma were delineated using four DL models: AccuContour, RT-Viewer-contour, RT-Mind, and PVmed Contouring. The overall kappa values and generalized conformity indices of these contours were calculated to assess consistency among the models. The Dice similarity coefficient (DSC), relative volume difference (RVD), 95th percentile Hausdorff distance (HD95), and average symmetric surface distance (ASSD) were calculated to evaluate the accuracy of the contours. Additionally, two innovative model frameworks were introduced to improve the fidelity and reliability of patient contour delineation. Consistency among the contours generated by the four DL models was poor for the GTV, pituitary gland, temporal lobes, and temporomandibular joints. Marked differences remained between the model-generated contours and the oncologists' manual delineations for the GTV, lenses, optic nerves, pituitary gland, temporomandibular joints, temporal lobes, and trachea. The proposed model frameworks can effectively optimize the contours of the GTV, brainstem, eyes, lenses, and temporomandibular joints. Contours generated by DL models still have deficiencies when applied to radiotherapy for nasopharyngeal carcinoma. To address this, two model frameworks were proposed to increase the robustness of automatic contouring.
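For readers unfamiliar with the overlap metrics named above, the following is a minimal sketch of how the DSC, the RVD, and a generalized conformity index could be computed. It is not the implementation used in this study. It assumes each contour is represented as a set of voxel indices, that RVD is signed and taken relative to the reference volume, and that the generalized conformity index is the ratio of summed pairwise intersections to summed pairwise unions. The function names and the toy masks are hypothetical.

```python
from itertools import combinations


def dice(pred, ref):
    """Dice similarity coefficient between two sets of voxel indices."""
    if not pred and not ref:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(pred & ref) / (len(pred) + len(ref))


def relative_volume_difference(pred, ref):
    """Signed RVD, (V_pred - V_ref) / V_ref; assumes a non-empty reference."""
    return (len(pred) - len(ref)) / len(ref)


def generalized_conformity_index(masks):
    """Summed pairwise intersections over summed pairwise unions.

    Assumes at least two masks with a non-empty combined union.
    """
    pairs = list(combinations(masks, 2))
    intersection = sum(len(a & b) for a, b in pairs)
    union = sum(len(a | b) for a, b in pairs)
    return intersection / union


# Toy 2D example: a 4x4 reference mask and a prediction shifted by one voxel.
ref = {(x, y) for x in range(4) for y in range(4)}
pred = {(x, y) for x in range(1, 5) for y in range(4)}

dsc = dice(pred, ref)
rvd = relative_volume_difference(pred, ref)
ci_gen = generalized_conformity_index([ref, pred])
```

The toy case also shows why several metrics are reported together: the shifted prediction has exactly the reference volume, so its RVD is zero even though the overlap is imperfect. Distance-based measures such as HD95 and ASSD additionally require surface extraction and voxel spacing, which this sketch leaves out.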