IEEE VIS 2024 Content: PrompTHis: Visualizing the Process and Influence of Prompt Editing during Text-to-Image Creation

Yuhan Guo

Hanning Shao

Can Liu

Kai Xu

Xiaoru Yuan

Room: Bayshore I

2024-10-16T17:00:00Z
Exemplar figure, described by caption below
When using text-to-image generative models, users may spend a lot of time on trial and error. PrompTHis is an interactive visual system that helps users understand how the models work by exploring their prompt history. It consists of a novel Image Variant Graph, which presents how specific word modifications affect the model's outputs, and a history box that shows the attempts in temporal order. The figure shows the prompting records of an artist. Starting from a black-and-white drawing of city buildings (1-5), the artist experimented with color styles (6-7, 8-10) and then returned to the black-and-white style (11-14), with “atomic explosion” inserted later (15).
Keywords

Text visualization, image visualization, text-to-image generation, editing history, provenance, generative art

Abstract

Generative text-to-image models, which allow users to create appealing images from a text prompt, have seen a dramatic increase in popularity in recent years. However, most users have a limited understanding of how such models work and often rely on trial-and-error strategies to achieve satisfactory results. The prompt history contains a wealth of information that could give users insight into what has been explored and how prompt changes affect the output image, yet little research attention has been paid to the visual analysis of this process to support users. We propose the Image Variant Graph, a novel visual representation designed to support comparing prompt-image pairs and exploring the editing history. The Image Variant Graph models prompt differences as edges between corresponding images and presents the distances between images through projection. Based on the graph, we developed the PrompTHis system through co-design with artists. By reviewing and analyzing the prompting history, users can better understand the impact of prompt changes and gain more effective control of image generation. A quantitative user study and qualitative interviews demonstrate that PrompTHis can help users review the prompt history, make sense of the model, and plan their creative process.
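The idea of modeling prompt differences as edges can be illustrated with a minimal sketch. It is not the PrompTHis implementation. It assumes that prompts are compared word by word, that only consecutive prompts in the history are connected, and that the example prompts are hypothetical stand-ins loosely based on the figure caption. The function names `prompt_diff` and `build_edges` are invented for illustration, and the projection of image distances is left out because it would require image embeddings.

```python
from difflib import SequenceMatcher


def prompt_diff(old: str, new: str) -> dict:
    """Return the words removed from and added to a prompt (word-level diff)."""
    a, b = old.split(), new.split()
    removed, added = [], []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(a[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(b[j1:j2])
    return {"removed": removed, "added": added}


def build_edges(prompts: list) -> list:
    """Connect each prompt to its predecessor (an assumption of this sketch).

    Each edge is (source index, target index, word-level difference).
    """
    return [
        (i - 1, i, prompt_diff(prompts[i - 1], prompts[i]))
        for i in range(1, len(prompts))
    ]


# Hypothetical prompt history, not the artist's actual prompts.
history = [
    "black-and-white drawing of city buildings",
    "colorful drawing of city buildings",
    "colorful drawing of city buildings with atomic explosion",
]

edges = build_edges(history)
for source, target, diff in edges:
    print(f"{source} -> {target}: -{diff['removed']} +{diff['added']}")
```

In a full system, each node would also carry its generated image, and node positions would come from projecting image features into two dimensions so that visually similar images sit close together.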