We all know by now that visualization, thanks to its amazing communication power, can be used to communicate messages that stick in people's minds effectively and persuasively. This same power, however, can also be used to mislead and misinform people very effectively! When techniques like non-zero baselines, scaling by area (quadratic change to represent a linear change), bad color maps, etc., are used, it is very easy to communicate the wrong message to your readers (whether this is done on purpose or for lack of better knowledge). But how easy is it?
How easy is it to deceive people with visualization?
This is the main question of the work we have recently published at ACM CHI’15 named “How Deceptive Are Deceptive Visualizations: An Empirical Analysis of Common Distortion Techniques“. This has been developed in my lab at NYU with our amazing students Anshul Vikram Pandey, Katharina Rall and my colleagues Meg Satterthwaite and Oded Nov.
The deception/distortion problem is of course very well known. Tufte has whole chapters on "Chartjunk" and the "Lie Factor" in "The Visual Display of Quantitative Information". The super classic "How to Lie with Statistics" has lots of interesting examples with graphs and stats. You may also want to read "How to Lie with Maps" and "How to Lie with Charts" and the excellent article "Disinformation Visualization: How to lie with datavis" (hosted by the excellent Visualizing Advocacy).
All these sources discuss the problem in detail and provide many examples, but we were surprised by the lack of experimental work in this area. To test the deception effect, we therefore designed an experiment.
For the experiment we selected a series of well-known graphical distortion techniques and ran a study on Amazon Mechanical Turk with hundreds of participants to measure the deception effect.
We used techniques such as:
When the bars do not start from zero (right) the difference looks much bigger.
When the axis is inverted (right) the values grow towards the bottom and thus mislead the reader.
The aspect ratio can be manipulated to give the impression that a quantity grows much faster or slower than it actually does.
When a quantity is mapped to the radius, what the reader perceives is the area of the bubble, which grows quadratically.
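The radius distortion is easy to quantify. A minimal sketch (the function names and scale factor are mine, for illustration only): mapping the value directly to the radius makes the perceived size grow with the square of the value, whereas the faithful encoding maps the value to the area, i.e. radius proportional to the square root of the value.

```python
import math

def area_if_radius_encoded(value, scale=1.0):
    """Deceptive encoding: the value is mapped directly to the radius,
    so the bubble's area (what the eye perceives) grows quadratically."""
    r = scale * value
    return math.pi * r ** 2

def area_if_area_encoded(value, scale=1.0):
    """Faithful encoding: the value is mapped to the area,
    i.e. radius proportional to sqrt(value)."""
    r = scale * math.sqrt(value)
    return math.pi * r ** 2

# For a value that merely doubles (10 -> 20):
deceptive_ratio = area_if_radius_encoded(20) / area_if_radius_encoded(10)  # 4.0
faithful_ratio = area_if_area_encoded(20) / area_if_area_encoded(10)       # 2.0
```

So a doubled value looks four times bigger under the deceptive encoding, which is exactly the quadratic exaggeration described above.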
How is the deceptive effect measured?
We present half of the participants with a deceptive version of the chart and half with the non-deceptive one. Then we ask both groups the same question and measure the difference between their answers. Let me give you an example.
We show bar charts similar to those shown above and say that they represent the percentage of the population with access to drinking water in Willowtown and Silvatown (yes, we use fake names; see why in the paper). One version has a truncated axis, the other does not. Then we ask: "How much better are the drinking water conditions in Willowtown as compared to Silvatown?".
The participants do not have to provide a precise number; rather, they give a rough estimate of the effect using a 5-item Likert scale, ranging from "slightly better" to "substantially better". We do the same with other charts and data, using slightly different questions, and then compare, for each case, the responses to the two versions. Here are the results.
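The between-groups comparison boils down to a shift in average Likert response. Here is a minimal sketch of the idea with made-up numbers (these are NOT the paper's data, just an illustration of the measurement):

```python
from statistics import mean

# Hypothetical responses on the 5-point scale
# (1 = "slightly better" ... 5 = "substantially better"); invented for illustration.
control_responses = [2, 1, 2, 3, 2, 2, 1, 2]    # group shown the non-deceptive chart
deceptive_responses = [4, 5, 4, 3, 4, 5, 4, 4]  # group shown the truncated-axis chart

# The deception effect is the shift in the average response between the two groups.
deception_effect = mean(deceptive_responses) - mean(control_responses)
print(deception_effect)  # 2.25 for these made-up numbers
```

A large positive shift means the deceptive version pushed readers toward "substantially better", which is the effect the study measures.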
The results, probably not surprising but reassuring, look like this:
How do you read this? This is a comparison of the average responses with confidence intervals. The average is the dot; the confidence interval is the pair of hinges you see around the dot (confidence intervals show plausible values, that is, where the dot may actually fall). Each color represents one chart type: line, bubble, bars (we also tested inverted axes in the paper, but they are not shown here).
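For readers unfamiliar with the dot-and-hinges display, this is roughly how such an interval is computed. A sketch assuming a normal-theory 95% interval around the sample mean (the paper may use a different procedure, e.g. bootstrapping; the function name is mine):

```python
from math import sqrt
from statistics import mean, stdev

def mean_with_ci(responses, z=1.96):
    """Sample mean with an approximate 95% normal-theory confidence
    interval (z times the standard error). Returns (lower, mean, upper)."""
    m = mean(responses)
    half_width = z * stdev(responses) / sqrt(len(responses))
    return m - half_width, m, m + half_width

# Example with made-up Likert responses:
lower, center, upper = mean_with_ci([2, 3, 2, 3, 2, 3, 2, 3])
```

The dot in the plot corresponds to `center`, and the hinges to `lower` and `upper`; non-overlapping intervals between the control and deceptive groups are what make the effect convincing.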
For each chart the control, that is, the chart without the deceptive effect, should lead to a smaller estimate in the response. When I ask you "How much better are the drinking water conditions in Willowtown as compared to Silvatown?", the control chart should show that the difference is not that big. But when I show you the same data with a truncated axis, the difference between the bars looks bigger, and this should lead to a much bigger estimate. This is exactly what you can see in the chart above. For each condition, the deceptive version always leads to larger estimates. You can also see that line charts and bar charts seem to have a more dramatic effect than bubbles.
The paper covers other aspects of the study. For instance, we tried to see whether this effect is modulated by individual differences such as gender, age, and education level, but we failed to find any such differences (which, of course, does not mean they do not exist).
What are the implications of the study? I think it very simply indicates that this effect is real and can be very large. So now, when you talk to someone and want to show that this effect is supported by evidence, you can point them to our paper. I think this work also points the way to other potentially interesting future work. We need much more evidence-based research in visualization to support our intuitions; this is a first step in that direction, and hopefully it will inspire other researchers to do the same.
Also, this work was done in collaboration with top Human Rights experts (Prof. Satterthwaite at NYU Law and her student Katharina Rall), and I am excited to see our work having an impact on such an important community. Human Rights advocates have a constant need to communicate messages effectively, persuasively, and with little bias. This is a tiny new brick in their toolbox, and that is very exciting!