Data Visualization Semantics

A few days ago I had a nice chat with Jon Schwabish over iced tea at Think Coffee in downtown Manhattan: which elements of a graphic design give meaning to a visualization? How do the graphical marks, their aesthetics, and their contextual components translate into meaningful concepts we can store in our heads?

Everything started from us discussing the role of text in visualization and how labels and annotations play a big role in this sense. Try to think about visualization with no text at all: where does the meaning come from?

I think interpretation depends at least on these two main factors: (1) background knowledge in the reader and (2) semantic cues in the graphics. Interpretation is a sort of “dance” between these two elements: what we have in our head influences what we see in the graphics (this is a very well known fact in vision science) and what we see in the graphics influences what we think.

Background Knowledge. No interpretation can happen unless we connect what we see with information already stored in our head. This is how Colin Ware puts it in his “Visual Thinking for Design”:

“When we look at something, perhaps 95 percent of what we consciously perceive is not what is “out there” but what is already in our heads in long-term memory. The reason we have the impression of perceiving a rich and complex environment is that we have rich and complex networks of meaning stored in our brains” [Ch.6, p.116]


“… we have been discussing about objects and scenes as pure visual entities. But scenes and objects have meaning largely through links to other kinds of information stored in a variety of specialized regions of the brain” [Ch.6, p.114]

We are so fixated on data today that we end up forgetting that data is merely a (dry) representation of a much more complex phenomenon, and that people need their own internal representation of this phenomenon in order to reason about it. This internal representation is independent of the data, and it plays a big role in how people interpret and interact with a visualization. Sure, one could always analyze a graph syntactically and say that something is increasing or decreasing over time, or that some “objects” cluster together, etc. But is that useful at all?

Of course, interpretation is subjective and biases pop up all the time, but how does the designer’s intent interact with the preconceptions, biases and skills of any given reader? This is a huge topic and I don’t see anything around that can help us sort these things out.

Interestingly, I see two opposite cases taking place in visualization use and practice. When visualization is used mainly as a communication tool, that is, to convey a predefined message the designer has crafted for the reader, the reader has to be educated before interpretation takes place.

But when visualization is used as an exploratory or decision-making tool developed for a group of domain experts, we have the opposite kind of gap: the designer is typically ignorant about the deep meaning of the data and needs to be educated before good design takes place. Without a very tight collaboration between the designer and the domain scientist it’s practically impossible to build something really useful. I have experienced that myself many times. Unfortunately, such a tight collaboration does not happen easily and it’s very hard to establish in the first place.

Semantic Cues. The way a visualization is designed can support or hinder the semantic association between graphical elements and concepts. The minimum requirement is that the user understands how the graphic works and what it represents. Some charts are easier to interpret because people are familiar with them; others are fancier and need additional explanation.

But even when a chart is familiar, explanations are needed to understand what the graphical objects represent. I have seen this problem so many times in presentations, especially when some fancy visualization technique is used: the presenter does not describe the semantic associations well enough and the audience gets totally lost.

Other than showing trends and quantities, a visualization needs to make clear how to create a mental link between the objects stored in your head and those perceived in the visualization: the “what”, “who”, “where” elements. The theory of visual encoding is so heavily based on the accurate representation of quantitative information that it seems like we have totally forgotten how important it is to employ effective encodings for the what/where channels. This is perhaps why geographical data is so often shown on a map. Keeping the geographical metaphor intact might not be the “best” visual encoding for the task at hand, yet it carries such a high degree of semantics that it’s hard to shy away from it.

Finally, going back to the original idea of this post, text is king when we talk about interpretation. Seriously, think about visualization with and without text. Text brings a visualization to life. It gives meaning to what you see. The most common textual elements you can find in a visualization are axis labels, legend labels, item labels, titles, and annotations, but I guess there are many unused or under-researched ones. Labeling is tricky and not well studied yet (except for label placement, a quite extensively developed niche). For instance, when more than a handful of data labels are shown, there is a high risk of cluttering the screen, and no obvious solution exists.
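To make the clutter problem concrete, here is a minimal sketch of one simple workaround, not a general solution: keep labels greedily in order of importance and drop any label whose bounding box overlaps one already kept. The `Label` type, its fields, and the `priority` values are assumptions made up for illustration; a real chart would get its boxes from the rendering library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Label:
    """Hypothetical label with an axis-aligned bounding box."""
    text: str
    x: float        # left edge of the box
    y: float        # bottom edge of the box
    width: float
    height: float
    priority: float  # higher means more important (assumed to be given)


def overlaps(a: Label, b: Label) -> bool:
    """Return True if the two bounding boxes intersect."""
    return (a.x < b.x + b.width and b.x < a.x + a.width and
            a.y < b.y + b.height and b.y < a.y + a.height)


def declutter(labels: list[Label]) -> list[Label]:
    """Greedily keep the most important labels that do not overlap."""
    kept: list[Label] = []
    for label in sorted(labels, key=lambda l: l.priority, reverse=True):
        if not any(overlaps(label, other) for other in kept):
            kept.append(label)
    return kept


labels = [
    Label("A", x=0, y=0, width=10, height=5, priority=3),
    Label("B", x=5, y=2, width=10, height=5, priority=1),  # collides with A
    Label("C", x=20, y=0, width=10, height=5, priority=2),
]
visible = [label.text for label in declutter(labels)]
```

Note the trade-off: the dropped labels are exactly the semantic cues the reader loses, which is why this kind of filtering is a stopgap rather than an answer.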

Also, there is limited understanding of how best to integrate visualization and text in a more natural and seamless way, a topic that goes beyond simply attaching labels to objects and well beyond the scope of this post.

And you? What do you think? Have you ever thought about how meaning is conveyed in visualization? Anything to add?

Thanks for reading. Take care.

5 thoughts on “Data Visualization Semantics”

  1. Riccardo Scalco

    Nice post. I think interactivity could help raise the semantic complexity of a visualization on demand. In other words, the visualization starts easy and then becomes more expressive on request, through user interaction (so we can assume the user understands what he is doing). To use a fancy expression, at the risk of abusing language, we can use interactivity not only horizontally in the semantic space, but also vertically. Anyway, I am quite new to this field, so feel free to correct my language and give me some hints. Cheers.

    1. FILWD

      Sure! I did not even touch upon the semantics of interaction. It looks pretty much like a big can of worms :) But yes, that is very relevant and interesting.

  2. yinshanyang

    Hi Enrico,

    I think you’re spot on with regard to the lack of attention given to the role of text in interpretation.

    As you have pointed out, text (or language) carries with it meaning that is learned and is thus better able to provide context to a visualisation. But another way of framing the value of text is that it has the affordance to convey abstract concepts where visual language or imagery fails. My naïve example is that text would be appropriate in conveying the concepts of “time of day” or “day of week” or “past 28 days”, where it would be hard to represent these concepts in iconography or visual form. And through this measure I find it easier to decide which aspects are best suited for textual representation and which are best suited for visual representation.


    That said, I think another aspect worth exploring in the semantics of visualisation is the format & layout of the visualisation and how that informs (or distorts) the reading of the visual element, e.g. colour.

    Just like how in the context of “I smell a rat” and “the cat chased the rat”, the element ‘rat’ has a different learned meaning depending on the line, so too can visual elements carry different learned meanings depending on the context or layout.

    For example, colour in a Hans Rosling-type scatterplot could easily imply categories, as trained by past exposures, and is thus augmented by a legend; colour in a heatmap could just as easily represent value instead, and is thus augmented by a scale. And to take it further, even the form of a legend versus a scale informs us about meaning.


  3. Joyce Lee, MD, MPH

    Hi Enrico. Great article. I really appreciate this comment:

    “the designer is typically ignorant about the deep meaning of the data and needs to be educated before good design takes place. Without a very tight collaboration between the designer and the domain scientist it’s practically impossible to build something really useful. I have experienced that myself many times. Unfortunately, such a tight collaboration does not happen easily and it’s very hard to establish in the first place.”

    Do you have examples of where there is a success case for this tight collaboration? I am a pediatric endocrinologist (doctor for kids with diabetes) and I am engaged in projects to try to use interactive visualization to help individuals with diabetes discover new patterns in their disease (by studying the blood sugars and insulin dosing) and to make changes to their regimen if problems are detected. So I find myself in this situation that you describe.

    Also, the combination of words with data visualization; do you have examples of effective ones that integrate visuals and text?

  4. Pingback: Data vizualisation – links and articles — Responsible Data Forum
