IEEE VIS 2024 Content: Beware of Validation by Eye: Visual Validation of Linear Trends in Scatterplots




Daniel Braun - University of Cologne, Cologne, Germany

Remco Chang - Tufts University, Medford, United States

Michael Gleicher - University of Wisconsin - Madison, Madison, United States

Tatiana von Landesberger - University of Cologne, Cologne, Germany


 Room: Bayshore V

2024-10-17T12:42:00Z

Figure: “Visual summary” of visual validation and estimation accuracy for linear trends in scatterplots. The figure shows the true OLS regression line (green) together with participants’ average response for estimation (blue) and the range of lines with an acceptance rate of 50% or higher for validation (orange).


Keywords

Perception, visual model validation, visual model estimation, user study, information visualization

Abstract

Visual validation of regression models in scatterplots is a common practice for assessing model quality, yet its efficacy remains unquantified. We conducted two empirical experiments to investigate individuals’ ability to visually validate linear regression models (linear trends) and to examine the impact of common visualization designs on validation quality. The first experiment showed that accuracy is higher for visual estimation of slope (i.e., fitting a line to data) than for visual validation of slope (i.e., accepting a shown line). Notably, we found a bias toward slopes that are “too steep” in both cases. This led to the novel insight that participants naturally assessed regression using orthogonal distances between the points and the line (i.e., orthogonal distance regression, ODR) rather than the common vertical distances (ordinary least squares, OLS). In the second experiment, we investigated whether incorporating common designs for regression visualization (error lines, bounding boxes, and confidence intervals) would improve visual validation. Even though error lines reduced validation bias, the results failed to show the desired improvements in accuracy for any design. Overall, our findings suggest caution in using visual model validation for linear trends in scatterplots.
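
To make the distinction between vertical and orthogonal distances concrete, the following is a minimal sketch (not the authors' analysis code) that fits both slopes to the same point cloud. The function names `ols_slope` and `odr_slope` and the synthetic data are assumptions made for illustration, and the orthogonal fit assumes x and y are on comparable scales with equal error variance in both directions.

```python
import numpy as np


def _second_moments(x, y):
    """Return the (biased) variances of x and y and their covariance."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = x - x.mean()
    dy = y - y.mean()
    return np.mean(dx * dx), np.mean(dy * dy), np.mean(dx * dy)


def ols_slope(x, y):
    """Slope that minimizes squared VERTICAL distances (ordinary least squares)."""
    sxx, _, sxy = _second_moments(x, y)
    return sxy / sxx


def odr_slope(x, y):
    """Slope that minimizes squared ORTHOGONAL distances to the line.

    Closed form for the equal-error-variance case; requires a nonzero
    covariance between x and y.
    """
    sxx, syy, sxy = _second_moments(x, y)
    return (syy - sxx + np.sqrt((syy - sxx) ** 2 + 4.0 * sxy**2)) / (2.0 * sxy)


if __name__ == "__main__":
    # Hypothetical synthetic data: a linear trend with vertical noise.
    rng = np.random.default_rng(0)
    x = rng.normal(size=200)
    y = 0.5 * x + rng.normal(scale=0.5, size=200)
    print("OLS slope:", ols_slope(x, y))
    print("ODR slope:", odr_slope(x, y))
```

For noisy data with a nonzero correlation, the orthogonal-distance slope is never shallower in magnitude than the OLS slope, which is consistent with responses based on orthogonal distances looking “too steep” when judged against OLS.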