InfoVis Course Diary: Basic Charts Need to Be Learned First

In the third week of my course I introduce fundamental charts. These are charts that are super common and most people are familiar with: bar charts, histograms, line charts, scatter plots, heat maps, etc. This is a major departure from the textbook I use, which introduces methods to visualize tabular data only in later chapters.

Why do I start with basic charts?

It’s pedagogically better to start with basic rather than advanced charts. I truly believe one must first learn how to use basic charts properly before moving on to other more exotic territories. Within their constrained space, there is a lot to learn and there are infinite variations and tweaks one can apply.

These charts, in their most basic format, cover all possible combinations of two attributes, thus giving students a manageable and yet powerful mental model to think about how pairs of attributes can be combined: a scatter plot is made of 2 quantitative attributes; a bar chart is made of 1 categorical and 1 quantitative attribute; a line chart is also made of 2 quantitative attributes but one is special as it represents time; a heat map is a combination of two categorical attributes and their frequencies, etc.
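This mental model can be sketched as a small lookup table. The snippet below is only an illustration, not course material: the type codes ("Q" for quantitative, "C" for categorical, "T" for time, treated here as its own code even though time is itself quantitative) and the function name suggest_chart are made up for the example.

```python
# Hypothetical type codes: "Q" = quantitative, "C" = categorical,
# "T" = time (a quantitative attribute given its own code for this sketch).
BASIC_CHARTS = {
    ("Q", "Q"): "scatter plot",
    ("C", "Q"): "bar chart",
    ("Q", "T"): "line chart",
    ("C", "C"): "heat map",  # cell color encodes the frequency of each pair
}


def suggest_chart(attr_a, attr_b):
    """Return the basic chart for a pair of attribute types, in any order."""
    key = tuple(sorted((attr_a, attr_b)))  # sort so ("Q", "C") == ("C", "Q")
    return BASIC_CHARTS.get(key, "no basic chart in this table")


if __name__ == "__main__":
    for pair in [("Q", "Q"), ("Q", "C"), ("T", "Q"), ("C", "C"), ("C", "T")]:
        print(pair, "->", suggest_chart(*pair))
```

The last pair has no entry in the table, which is a reminder that the table only covers the combinations listed above.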

These charts are also infinitely “tweakable” while retaining simplicity. For instance, what happens to these charts when you need to map an additional 3rd attribute? In scatter plots you can use color and/or size. In bar charts you can use stacked or grouped bars. In line charts you can add multiple lines.

Take the scatter plot below, in which I am plotting data from the USDA food nutrients database (each dot is a food; the axes are the amounts of monounsaturated and polyunsaturated fats).

Scatter plots are infinitely “tweakable”. Here we progressively encode 2, 3 and 4 attributes with position, color hue and size.

In its most basic format it encodes two quantitative attributes (the amounts of two types of fat). The next one encodes a third categorical attribute (food type) using color hue. And the final one encodes a fourth attribute (amount of water) with size. That’s a very gentle and yet solid way of introducing fundamental visualization concepts. In class I actually cycle through many of these examples, starting from the most basic chart and asking students to accommodate more data or needs.

With these basic charts one can also start introducing examples of problematic design solutions. For instance, it’s easy to talk about the “truncated axis” and the “dual axis” problems.

See, for instance, the infamous Planned Parenthood hearing charts that made the news last year.

Misleading chart in which the dual-axis method has been used to give a false impression about the data.
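The “truncated axis” problem mentioned above is easy to quantify with a little arithmetic. The sketch below compares the ratio of two values with the ratio of the bar lengths a reader actually sees once the axis no longer starts at zero. The function name drawn_ratio and the numbers 52, 50 and 48 are made up for illustration; they do not come from any real chart.

```python
def drawn_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar lengths for values a and b
    when the value axis starts at `baseline` instead of zero."""
    if baseline >= min(a, b):
        raise ValueError("baseline must be below both values")
    return (a - baseline) / (b - baseline)


if __name__ == "__main__":
    a, b = 52, 50  # made-up values that differ by 4%

    honest = drawn_ratio(a, b)                  # axis starts at 0
    truncated = drawn_ratio(a, b, baseline=48)  # axis starts at 48

    print(f"axis from 0:  first bar is {honest:.2f}x the second")
    print(f"axis from 48: first bar is {truncated:.2f}x the second")
```

With the axis starting at 48, a 4% difference is drawn as one bar twice as long as the other, which is exactly the kind of distortion students can learn to spot.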

Another aspect is that basic charts are an amazing toolbox for visualization designers because the vast majority of existing problems can be solved with them. It’s very rare to find a visualization problem that cannot be solved, at least in a first approximation, with these basic designs. Plus, whenever needed, you can always apply small modifications to adapt them to your needs. I really believe there are way more interesting designs one can generate by tweaking these basic charts than by trying to come up with something entirely new.

Finally, basic charts are the most familiar ones. If familiarity is an important aspect of your project, you don’t want people to spend time figuring out how to decode your charts. This is an aspect that is often overlooked in visualization. When presented with a new visualization, the first thing the reader/user needs to figure out is “how do I actually read this?”.

Some questions from the students …

Before I conclude I want to briefly touch upon a few recurring questions students asked in their reading response exercise (I’m paraphrasing).

Q: “Are pie charts really so bad? I like them!”

Students are always puzzled when they see that pie charts are so heavily criticized by some people. I am personally not interested at all in the pie chart debate because I do not think it’s really consequential. In any case, what I tell my students is that they should never use rules blindly. So pie charts are a good example to exercise their own good judgment. When are they appropriate? When are they not? This is what matters the most to me. I am myself not a big fan of pie charts but I do not believe they are evil, and I do believe there are situations in which it may be reasonable to use them. Robert Kosara has published a good number of interesting blog posts and papers on the topic.

Q: “How do we deal with people’s subjective preference for some charts?”

That’s actually a big one. Students at the beginning of the course are still puzzled by the idea that some charts may be objectively better than others in some contexts and for some tasks. Because of that, they feel lost when they think that some people reading their charts may find them not exactly to their “taste”.

This is too long a subject to be developed fully here. But the main thing to learn is that charts need to be effective before being “pleasurable”. Aesthetics plays a role of course but it cannot subsume effectiveness. Therefore one needs to learn what is effective and what is not and then find a way to “inject” the right aesthetic sense within these constraints.

The best data visualizations out there (and the best designers by the way) are those that are great at finding this fine balance.

But there is more to say on the topic. A very important principle, dear to user experience designers, is that good design is about giving people what they need, not necessarily what they tell you they need. It’s your responsibility as a designer and as an engineer to figure out what the best solution for your audience is; you can’t rely exclusively on what they tell you they want or need.

Q: “How do we deal with large data sets? These charts do not scale!”

Correct. Many of the basic charts do not scale to large data sets or large numbers of values. But this is exactly the point! By realizing what the limitations of basic charts are, one is forced to think about what the alternatives are. It’s a very useful step from the pedagogical point of view.

But even before moving to more exotic solutions, students first need to figure out how scalable basic charts really are. A very consistent trend I have found in students is that they are afraid of making charts small and, because of that, they greatly underestimate how scalable these charts really are!

As Tufte’s work on “sparklines” demonstrates, charts can be incredibly small and still convey a lot of information. So, while there are cases in which standard charts may actually not be sufficiently scalable, I consider it crucial to first show students how to make charts small. Very small.
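To get a feel for how small a chart can be, here is a minimal sketch that squeezes a series into a single line of text using Unicode block characters. This is only a rough text approximation of the sparkline idea, not Tufte’s own design; the function name sparkline and the sample numbers are made up for illustration.

```python
# Eight block characters of increasing height, lowest to highest.
BLOCKS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Render a sequence of numbers as a one-line text chart."""
    values = list(values)
    if not values:
        return ""
    lo, hi = min(values), max(values)
    span = hi - lo
    if span == 0:
        # A flat series: draw every point at the lowest level.
        return BLOCKS[0] * len(values)
    top = len(BLOCKS) - 1
    # Rescale each value to an index between 0 and 7 and pick that block.
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)


if __name__ == "__main__":
    series = [3, 5, 9, 4, 2, 8, 12, 7, 6, 10]  # made-up sample data
    print(sparkline(series))
```

Each value takes up a single character, so a whole series fits inside a sentence or a table cell.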

That’s all for now. Hope this helps!