IEEE VIS 2025 Content: VisAnatomy: An SVG Chart Corpus with Fine-Grained Semantic Labels

VisAnatomy: An SVG Chart Corpus with Fine-Grained Semantic Labels

Chen Chen

Hannah Bako

Peihong Yu

John Hooker

Jeffrey Joyal

Simon Wang

Samuel Kim

Jessica Wu

Aoxue Ding

Lara Sandeep

Alex Chen

Chayanika Sinha

Zhicheng Liu

Room: Hall M2

Keywords

Chart, SVG, data visualization, corpus, dataset, multilevel fine-grained semantic labels

Abstract

Chart corpora, which comprise data visualizations and their semantic labels, are crucial for advancing visualization research. However, the labels in most existing corpora are high-level (e.g., chart types), hindering their utility for broader applications in the era of AI. In this paper, we contribute VisAnatomy, a corpus containing 942 real-world SVG charts produced by over 50 tools, encompassing 40 chart types and featuring structural and stylistic design variations. Each chart is augmented with multi-level fine-grained labels on its semantic components, including each graphical element’s type, role, and position, hierarchical groupings of elements, group layouts, and visual encodings. In total, VisAnatomy provides labels for more than 383k graphical elements. We demonstrate the richness of the semantic labels by comparing VisAnatomy with existing corpora. We illustrate its usefulness through four applications: semantic role inference for SVG elements, chart semantic decomposition, chart type classification, and content navigation for accessibility. Finally, we discuss research opportunities to further improve VisAnatomy.