IEEE VIS 2024 Content: Exploring the Capability of LLMs in Performing Low-Level Visual Analytic Tasks on SVG Data Visualizations

Zhongzheng Xu - Brown University, Providence, United States

Emily Wall - Emory University, Atlanta, United States

Room: Bayshore VI

2024-10-17T18:48:00Z
The image is an illustration of the study design of the paper Exploring the Capability of LLMs in Performing Low-Level Visual Analytic Tasks on SVG Data Visualizations. This figure consists of three main components: Plot Type, Plot Difficulty, and Low-level Visual Analytics Tasks. Plot Types include Scatter, Line, and Bar charts, all in SVG format. Plot Difficulty is divided into Small Labeled, Small Unlabeled, Medium Labeled, and Medium Unlabeled, with 20 sets of each type. Low-level Visual Analytics Tasks include Retrieve Value, Filter, Compute Derived Value, Find Extremum, Sort, Determine Range, Characterize Distribution, Find Anomalies, Cluster, and Correlate.
Keywords

Data Visualization, Large Language Models (LLM), Visual Analytics Tasks, Scalable Vector Graphics (SVG)

Abstract

Data visualizations help extract insights from datasets, but reaching these insights requires decomposing high-level goals into low-level analytic tasks that can be complex due to varying degrees of data literacy and visualization experience. Recent advancements in large language models (LLMs) have shown promise for lowering barriers for users to achieve tasks such as writing code and may likewise facilitate visualization insight. Scalable Vector Graphics (SVG), a text-based image format common in data visualizations, matches well with the text sequence processing of transformer-based LLMs. In this paper, we explore the capability of LLMs to perform 10 low-level visual analytic tasks defined by Amar, Eagan, and Stasko directly on SVG-based visualizations. Using zero-shot prompts, we instruct the models to provide responses or modify the SVG code based on given visualizations. Our findings demonstrate that LLMs can effectively modify existing SVG visualizations for some tasks like Cluster but perform poorly on tasks requiring mathematical operations like Compute Derived Value. We also discovered that LLM performance can vary based on factors such as the number of data points, the presence of value labels, and the chart type. Our findings contribute to gauging the general capabilities of LLMs and highlight the need for further exploration and development to fully harness their potential in supporting visual analytic tasks.
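
To make the setup in the abstract concrete, the following Python sketch shows why SVG suits text-based models: a chart is just a string of markup, so it can be placed directly into a zero-shot prompt. This is an illustration under stated assumptions, not the authors' pipeline. The function names `make_bar_chart_svg` and `build_zero_shot_prompt`, the chart layout, and the prompt wording are all hypothetical; only the task name "Find Extremum" and the labeled/unlabeled distinction come from the paper's description.

```python
def make_bar_chart_svg(values, labeled=True, bar_width=30, gap=10, height=120):
    """Build a minimal SVG bar chart as plain text (hypothetical layout)."""
    width = len(values) * (bar_width + gap) + gap
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
    ]
    for i, v in enumerate(values):
        x = gap + i * (bar_width + gap)
        y = height - v  # SVG's y axis points down, so taller bars start higher up
        parts.append(
            f'<rect x="{x}" y="{y}" width="{bar_width}" height="{v}" fill="steelblue"/>'
        )
        if labeled:
            # Value labels correspond to the "Labeled" plot condition.
            parts.append(f'<text x="{x}" y="{y - 4}">{v}</text>')
    parts.append("</svg>")
    return "\n".join(parts)


def build_zero_shot_prompt(svg_text, task, question):
    """Compose a zero-shot prompt: no worked examples, only task and SVG.

    The wording here is an assumption, not the prompt used in the paper.
    """
    return (
        "You are given a data visualization as SVG code.\n"
        f"Task ({task}): {question}\n\n"
        f"SVG:\n{svg_text}\n"
    )


values = [40, 95, 60]
svg_text = make_bar_chart_svg(values, labeled=True)
prompt = build_zero_shot_prompt(
    svg_text, "Find Extremum", "Which bar has the largest value?"
)
print(prompt)

# Ground truth that a model's answer could be checked against.
expected_index = values.index(max(values))
```

With `labeled=False` the values survive only in the `height` and `y` attributes of each `<rect>`, so the model would have to infer them from geometry. That is one plausible reason the presence of value labels could affect performance, as the abstract reports.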