Data Visualization Semantics

A few days ago I had this nice chat with Jon Schwabish while sipping some iced tea at Think Coffee in downtown Manhattan: what elements of a graphic design give meaning to a visualization? How do the graphical marks, their aesthetics, and their contextual components translate into meaningful concepts we can store in our head?

Everything started from us discussing the role of text in visualization and how labels and annotations play a big role in this sense. Try to think about a visualization with no text at all: where does the meaning come from?

I think interpretation depends at least on these two main factors: (1) background knowledge in the reader and (2) semantic cues in the graphics. Interpretation is a sort of “dance” between these two elements: what we have in our head influences what we see in the graphics (this is a very well known fact in vision science) and what we see in the graphics influences what we think.

Background Knowledge. No interpretation can happen if we do not connect what we see with information that is already stored in our head. This is how Colin Ware puts it in his “Visual Thinking for Design”:

“When we look at something, perhaps 95 percent of what we consciously perceive is not what is “out there” but what is already in our heads in long-term memory. The reason we have the impression of perceiving a rich and complex environment is that we have rich and complex networks of meaning stored in our brains” [Ch.6, p.116]

And:

“… we have been discussing about objects and scenes as pure visual entities. But scenes and objects have meaning largely through links to other kinds of information stored in a variety of specialized regions of the brain” [Ch.6, p.114]

We are so fixated on data today that we end up forgetting data is merely a (dry) representation of a much more complex phenomenon, and that people need to have their own internal representation of this phenomenon in order to reason about it. This representation is independent of the data, and it plays a big role in how people interpret and interact with a visualization. Sure, one could always analyze a graph syntactically and say that something is increasing or decreasing over time, or that some “objects” cluster together, etc. But is that useful at all?

Of course, interpretation is subjective and biases pop up all the time, but how does the designer’s intent interact with all the preconceptions, biases and skills of any given reader? This is a huge topic and I don’t see anything around that can help us sort these things out.

Interestingly, I see two opposite cases taking place in visualization use and practice. When visualization is used mainly as a communication tool, that is, to convey a predefined message the designer has crafted for the reader, the reader has to be educated before interpretation takes place.

But when visualization is used as an exploratory or decision making tool developed for a group of domain experts, we have an opposite kind of gap: the designer is typically ignorant about the deep meaning of the data and needs to be educated before good design takes place. Without a very tight collaboration between the designer and the domain scientist it’s practically impossible to build something really useful. I have experienced that myself many times. Unfortunately, such a tight collaboration does not happen easily and it’s very hard to establish in the first place.

Semantic Cues. The way a visualization itself is designed can support or hinder the semantic association between graphical elements and concepts. The minimum requirement is that the user understands how the graphic works and what it represents. Some charts are easier to interpret because people are familiar with them; others are fancier and need additional explanations.

But even when a chart is familiar, explanations are needed to understand what the graphical objects represent. I have seen this problem so many times in presentations, especially when some fancy visualization technique is used: the presenter does not describe the semantic associations well enough and the audience gets totally lost.

Besides showing trends and quantities, a visualization needs to make clear how to create a mental link between the objects stored in your head and those perceived in the visualization: the “what”, “who”, and “where” elements. The theory of visual encoding is so heavily based on the accurate representation of quantitative information that it seems like we have totally forgotten how important it is to employ effective encodings for the what/where channels. This is perhaps why geographical data is so often shown on a map. Keeping the geographical metaphor intact might not be the “best” visual encoding for the task at hand, yet it carries such a high degree of semantics that it’s hard to shy away from it.

Finally, going back to the original idea of this post, text is king when we talk about interpretation. Seriously, think about a visualization with and without text. Text brings a visualization to life. It gives meaning to what you see. The most common textual elements you can find in a visualization are axis labels, legend labels, item labels, titles, and annotations, but I guess there are many unused or under-researched aspects. Labeling is tricky and not well studied yet (except for label placement, a quite extensively developed niche). For instance, when the number of data labels shown exceeds a handful, there is a high risk of cluttering the screen, and no obvious solution exists.

Also, we have only a limited understanding of how best to integrate visualization and text in a more natural and seamless way. That question goes beyond simply attaching labels to objects, and well beyond the scope of this post.

And you? What do you think? Have you ever thought about how meaning is conveyed in visualization? Anything to add?

Thanks for reading. Take care.