Abstract
PURPOSE: Registration of computed tomography (CT) to laparoscopic video images is vital to enable augmented reality (AR), a technology that holds the promise of minimising the risk of complications during laparoscopic liver surgery. Although several solutions have been presented in the literature, they always rely on an accurate initialisation of the registration that is either obtained manually or estimated automatically from very specific views of the liver. These limitations pose a challenge to the clinical translation of AR. METHODS: We propose the use of a content-based image retrieval (CBIR) framework to obtain an automatic, robust initialisation of the registration. Instead of directly registering video and CT, we render a dense set of possible views of the liver from CT and extract liver contour features. To reduce the feature maps to lower-dimensional vectors, we use a deep hashing (DH) network that is trained in a triplet scheme. Registration is obtained by matching the hash encoding of the intra-operative image to the closest encodings found among the pre-operative renderings. RESULTS: We validate our method on synthetic and real data from a phantom and on real patient data from eight surgeries. Phantom experiments show that registration errors acceptable for an initial registration are obtained if a sufficient number of pre-operative solutions is considered. In seven out of eight patients, the method obtains a clinically relevant alignment. CONCLUSION: We present the first work to adapt DH to the CT-to-video registration problem. Our results indicate that this framework can effectively replace manual initialisations in multiple views, potentially increasing the clinical translation of these techniques.
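The retrieval step described under METHODS can be illustrated with a minimal sketch. It assumes that the hash encodings are binary vectors compared by Hamming distance and that each pre-operative rendering is stored with a 6-parameter camera pose; the abstract does not specify the distance metric, code length, or pose parametrisation, so `hamming_top_k`, `database_codes`, and `database_poses` are hypothetical names, and random codes stand in for the output of the DH network.

```python
import numpy as np


def hamming_top_k(query_code, database_codes, k=5):
    """Return indices and distances of the k database codes closest to the query.

    Assumption: codes are binary (0/1) vectors and closeness is measured by
    Hamming distance, i.e. the number of differing bit positions.
    """
    distances = np.count_nonzero(database_codes != query_code, axis=1)
    # Stable sort so that ties are resolved in favour of the lower index.
    order = np.argsort(distances, kind="stable")[:k]
    return order, distances[order]


# Toy database: random codes stand in for encodings of rendered CT views.
rng = np.random.default_rng(0)
n_views, n_bits = 1000, 64  # illustrative sizes, not taken from the paper
database_codes = rng.integers(0, 2, size=(n_views, n_bits), dtype=np.uint8)
# Hypothetical pose (3 rotation + 3 translation parameters) per rendered view.
database_poses = rng.normal(size=(n_views, 6))

# Simulate an intra-operative encoding: a known view with three bits flipped.
query_code = database_codes[42].copy()
query_code[:3] ^= 1

idx, dist = hamming_top_k(query_code, database_codes, k=5)
# Poses of the retrieved renderings serve as candidate initialisations.
candidate_poses = database_poses[idx]
```

Returning several candidates rather than only the single nearest one mirrors the observation under RESULTS that acceptable initial errors depend on considering a sufficient number of pre-operative solutions.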