Abstract
Research on deep-learning-based 3D model reconstruction from a single image has achieved remarkable progress. However, compared with images, sketches carry far less visual information, which makes it difficult for reconstruction algorithms to interpret them correctly. Herein, we introduce a streamlined network architecture for sketch-to-3D mesh generation, designed to address the challenge of reconstructing high-fidelity 3D models from a single hand-drawn sketch. Our approach employs the expressive PowerMLP architecture within an encoder-decoder framework, offering greater representational capability than traditional MLP implementations. By integrating 3D shape constraints in place of conventional discriminators, we preserve geometric fidelity within a collaborative generation process. Experimental results demonstrate state-of-the-art (SOTA) performance on both synthetic stylized sketches and real-world hand-drawn inputs, validating the method's robustness and adaptability.