IEEE VIS 2025 Content: Sel3DCraft: Interactive Visual Prompts for User-Friendly Text-to-3D Generation

Nan Xiang
Tianyi Liang
Haiwen Huang
Shiqi Jiang
Hao Huang
Yifei Huang
Liangyu Chen
Changbo Wang
Chenhui Li

Room: Hall E1

2025-11-05T13:48:00.000Z

Recorded video from this session can be viewed at the following link.
https://youtu.be/zLchh59XyM4

Keywords

Prompt engineering, text-to-3D generation, shape exploration, visualization design, visual perception

Abstract

Text-to-3D (T23D) generation has transformed digital content creation, yet it remains bottlenecked by blind trial-and-error prompting that yields unpredictable results. While visual prompt engineering has advanced in the text-to-image domain, applying it to 3D generation poses unique challenges, because it requires multi-view consistency evaluation and spatial understanding. We present Sel3DCraft, a visual prompt engineering system for T23D that transforms unstructured exploration into a guided visual process. Our approach introduces three key innovations: a dual-branch structure combining retrieval and generation for diverse candidate exploration; a multi-view hybrid scoring approach that leverages multimodal large language models (MLLMs) with innovative high-level metrics to assess 3D models with human-expert consistency; and a prompt-driven visual analytics suite that enables intuitive defect identification and refinement. Extensive testing and a user study demonstrate that Sel3DCraft surpasses other T23D systems in supporting creativity for designers.