Enhanced composed fashion image retrieval with a multi-hop reasoning framework


Abstract

This paper introduces a Multi-Hop Reasoning Framework for Composed Fashion Image Retrieval (CFIR), designed to overcome the inherent limitations of existing single-step and hierarchical retrieval methods when handling complex multimodal queries in CFIR. Traditional CFIR approaches often struggle to accurately interpret the intricate interplay between textual descriptions and visual content within fashion datasets. Our methodology harnesses multi-hop reasoning to iteratively refine the retrieval process, enabling a deeper and more nuanced integration of visual and textual data. This structured approach not only enhances the model's interpretative capabilities but also significantly improves its ability to discern subtle relationships between reference and target images across various modification descriptions. By incorporating multiple reasoning steps, the framework adeptly manages the compositionality inherent in fashion-related queries, resulting in superior retrieval accuracy and performance. We thoroughly validate our approach through rigorous experiments on three extensive fashion image datasets: Fashion-IQ, Shoes, and Fashion200k. The results demonstrate marked improvements over state-of-the-art methods, highlighting the potential of our multi-hop reasoning framework to set a new benchmark in the field of image retrieval.
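To make the abstract's core idea concrete, the following is a minimal, hypothetical sketch of multi-hop composed retrieval: a reference-image embedding is iteratively fused with the text-modification embedding over several "hops", and the refined query ranks a gallery of candidate images. All function names, the blending-based fusion, and the embedding dimensions are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of multi-hop query refinement for composed image
# retrieval. The simple weighted blend stands in for whatever learned
# fusion module the paper actually uses.
import numpy as np

def fuse(query_emb, text_emb, alpha=0.5):
    """One reasoning hop: blend the current query with the text
    modification embedding and re-normalize."""
    fused = (1 - alpha) * query_emb + alpha * text_emb
    return fused / np.linalg.norm(fused)

def multi_hop_retrieve(ref_emb, text_emb, gallery, hops=3):
    """Refine the composed query over several hops, then rank the
    gallery by cosine similarity (gallery rows are unit-normalized)."""
    query = ref_emb / np.linalg.norm(ref_emb)
    for _ in range(hops):
        query = fuse(query, text_emb)
    scores = gallery @ query
    return np.argsort(-scores)  # candidate indices, best match first

# Toy usage with random embeddings.
rng = np.random.default_rng(0)
gallery = rng.normal(size=(5, 8))
gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
ranking = multi_hop_retrieve(gallery[2], rng.normal(size=8), gallery)
```

Each hop here nudges the query toward the text modification while keeping it on the unit sphere; in the paper's framework, the per-hop update would instead be a learned multimodal reasoning step.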
