Abstract
With the rapid development of large generative models for 3D content, image-to-3D and text-to-3D generation have become major research directions in computer vision and graphics. Single-view 3D reconstruction, in particular, offers a convenient and practical solution. However, automatically selecting the best image from a large collection is critical to both reconstruction quality and efficiency. This paper proposes a novel image selection framework based on a multi-feature fusion quadtree structure. Our approach integrates diverse visual and semantic features and uses a hierarchical quadtree to efficiently evaluate image content, allowing us to identify the most informative and reconstruction-friendly image in large datasets. We then use Tencent's Hunyuan 3D model to verify that the selected image improves reconstruction performance. Experimental results show that our method outperforms existing approaches across key metrics: baseline methods achieved average error scores of 6.357 in Accuracy, 6.967 in Completeness, and 6.662 Overall, whereas our method reduced these to 4.238, 5.166, and 4.702, respectively, corresponding to an average error reduction of 29.5%. These results confirm that our approach reduces reconstruction errors, improves geometric consistency, and yields more visually plausible 3D models.