IEEE VIS 2025 Content: Sel3DCraft: Interactive Visual Prompts for User-Friendly Text-to-3D Generation

Nan Xiang

Tianyi Liang

Haiwen Huang

Shiqi Jiang

Hao Huang

Yifei Huang

Liangyu Chen

Changbo Wang

Chenhui Li

This work primarily targets practitioners in 3D content creation, including digital artists, industrial designers, and CAD software developers. Professionals can adopt the AI-assisted selection paradigm of the proposed system, Sel3DCraft, to reduce repetitive operations and find inspiration. The comparative study results also provide evidence-based insights for designing more intuitive text-to-3D interfaces, which benefits users and informs tool development across the industry.

Keywords

Prompt engineering, text-to-3D generation, shape exploration, visualization design, visual perception

Abstract

Text-to-3D (T23D) generation has transformed digital content creation, yet it remains bottlenecked by blind trial-and-error prompting that yields unpredictable results. While visual prompt engineering has advanced in text-to-image domains, applying it to 3D generation presents unique challenges, because it requires multi-view consistency evaluation and spatial understanding. We present Sel3DCraft, a visual prompt engineering system for T23D that transforms unstructured exploration into a guided visual process. Our approach introduces three key innovations: a dual-branch structure that combines retrieval and generation for diverse candidate exploration; a multi-view hybrid scoring approach that leverages multimodal large language models (MLLMs) with innovative high-level metrics to assess 3D models consistently with human experts; and a prompt-driven visual analytics suite that enables intuitive defect identification and refinement. Extensive testing and a user study demonstrate that Sel3DCraft surpasses other T23D systems in supporting creativity for designers.