Abstract
BACKGROUND: In radiotherapy, 2D orthogonally projected kV images are used for patient alignment when 3D on-board imaging (3D-OBI) is unavailable. However, tumor visibility is limited because the patient's anatomy is projected onto a 2D plane, which can lead to substantial setup errors. In treatment rooms equipped with 3D-OBI such as cone beam CT (CBCT), the CBCT has a limited field of view (FOV) and delivers an unnecessarily high imaging dose. A solution to this dilemma is to reconstruct a 3D CT from kV images acquired at the treatment position. METHODS: We propose a dual-model framework built with hierarchical vision transformer (ViT) blocks. Unlike proof-of-concept approaches, our framework takes kV images acquired by 2D imaging devices in the treatment room as its sole input and can synthesize an accurate, full-size 3D CT within milliseconds. RESULTS: We demonstrate the feasibility of the proposed approach on 10 patients with head and neck (H&N) cancer in terms of image quality (mean absolute error, MAE: < 45 HU), dosimetric accuracy (gamma passing rate at 2%/2 mm/10%: > 97%), and patient position uncertainty (shift error: < 0.4 mm). CONCLUSIONS: The proposed framework can generate an accurate 3D CT that faithfully mirrors the patient position, thus substantially improving patient setup accuracy, keeping imaging dose minimal, and maintaining treatment veracity.