Demystifying Cargo Cult Visualization: You Cannot Visualize 3 Variables by Mixing 3 Colors

by Enrico on January 21, 2011

in Guides,Thoughts

You may have noticed that last week there was a spike of interest around a “new” visualization technique proposed by GOOD Magazine, in which 3 colors are used to represent 3 aspects of a demographic data set. I originally answered a question posted on Twitter by Moritz Stefaner, in which he asked what we thought about it. Surprisingly, the whole thing spread like a virus: new blog posts popped up here and there, and people came up with all kinds of sophisticated explanations and arguments about what is good, what is bad, what could be good if, what could be done better, etc. If your radar didn’t catch these signals, take a look at Andy Kirk’s very well-crafted post, which pretty much summarizes the whole thing.

I won’t pull any punches here: in my humble opinion this is plain BS … or, better, it is what I call Cargo Cult Visualization. I’ll describe what I mean by this term, how visualization theory predicts that the technique is plain wrong, and why you’d better study some basic theory before attempting new “inventions”.

Why you cannot visualize 3 variables with 3 colors

Of course I have nothing against experimentation and it’s totally fine to explore crazy ideas with the purpose of learning something new out of it. But here we have a new technique sold as an invention when in fact the technique is not new and science predicts it doesn’t work.

When I originally read Moritz’s invitation to comment on this technique, all of a sudden it reminded me of a couple of pages from Colin Ware’s Information Visualization book. The theory behind it is called Integral-Separable Dimensions and it explains why this cannot work.

Integral-Separable Dimensions Theory

Colin Ware (on page 177) explains the theory and why it is important in the visual encoding of data dimensions. When we build visualizations we map data features to visual features (size, color, shape, etc.) and we expect to see similarities and relationships between these objects visually. The problem is that the choice of which visual features are used together to encode the various data features greatly affects the way they are perceived. All features influence each other to some extent, but some more than others. For instance, if you use color and size to encode two data features, the way color is perceived will be affected by the size of the object (as well as by a number of other contextual factors). It turns out that this effect has been the object of several studies in vision science and we know quite well how certain features interact.

We say that two dimensions (features) are integral when they are perceived holistically, that is, it’s hard to visually decode the value of one independently from the other. Picture a series of rectangles in a scatter plot where the height is mapped to one data feature and the width to another:

integral separable dimensions

Can you easily spot all the rectangles with the same width? No, it’s not fast.

And what about all those with the same height? The same.

Here are those with the same width:

integral separable dimensions

And here those with the same height:

integral separable dimensions

On the contrary, if you use color and shape the task is easier.

integral separable dimensions

You can more easily spot yellow or black dots. And you can also spot circles or squares. It’s not super fast but it’s better, right? Shape and color are in fact more separable than width and height.

Again, in the book you can find a clarifying example on page 181 (this is actually the picture that came to my mind first). The dimensions are ordered from the most integral, at the top, to the most separable, at the bottom.

Integral-Separable Dimension from Colin Ware's Book

Notice anything? What is at the top? Color channels. You see, it takes reading a couple of pages to demystify a cargo cult technique. And what the map in GOOD Magazine proposes is not even what is suggested here by Colin Ware, because Colin suggests using some specific color channels.

How the idea could be implemented better

So the problem with this map is not only the choice of using 3 colors for 3 dimensions but also the bad execution of the idea. Color theory in fact teaches us that colors can be described by 3 channels. There exist a fairly large number of ways to describe color, and they all describe color with 3 channels. Why didn’t the authors try to use at least one of those? Some of them won’t work anyway, but at least it would make more sense. Also, two of the original data features, high school graduates and college graduates, could easily be combined to answer their question, and then more visual options would be available. Color might still not be optimal, but it would be interesting to explore!
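To make the "3 channels" point concrete, here is a minimal sketch using Python's standard colorsys module. The example color is an arbitrary assumption chosen only for illustration; the point is simply that each of these descriptions uses exactly three numbers for the same color.

```python
import colorsys

# An arbitrary example color (an assumption for illustration), as RGB in 0..1.
r, g, b = 0.8, 0.4, 0.2

# Four descriptions of the same color; each one has exactly three channels.
descriptions = {
    "RGB": (r, g, b),
    "HSV": colorsys.rgb_to_hsv(r, g, b),  # hue, saturation, value
    "HLS": colorsys.rgb_to_hls(r, g, b),  # hue, lightness, saturation
    "YIQ": colorsys.rgb_to_yiq(r, g, b),  # luma plus two chroma channels
}

for name, channels in descriptions.items():
    print(name, tuple(round(c, 3) for c in channels))
```

Which of these channel sets, if any, is suitable for carrying data is a separate question; the sketch only shows that alternatives to mixing three primaries are easy to reach.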

An alternative experiment that could be run is based on the Opponent Process Theory (sorry, I know I am getting too technical here, but I am not trying to impress you; I just want to demonstrate how experimentation should be guided by knowledge). I won’t explain the theory in detail; again, you can find it in Colin Ware’s book (page 110). The theory says a simple thing: our visual system and its internal circuitry are built in such a way that we naturally have 3 embedded channels: yellow-blue, red-green, black-white. If these are our natural channels, why not try them? That would be interesting to explore! I wouldn’t expect to get an enlightening map out of it, but at least it is worth trying.
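As a rough sketch of what the three opponent channels look like in practice, the function below converts an RGB color into black-white, red-green and yellow-blue values. The formula and the name opponent_channels are simplifying assumptions made up for this illustration; this is not the model described in Ware's book, and real opponent color spaces use more careful weightings.

```python
def opponent_channels(r, g, b):
    """Map RGB values in 0..1 to (black_white, red_green, yellow_blue).

    A crude, hypothetical approximation for illustration only.
    """
    black_white = (r + g + b) / 3.0   # 0 = black, 1 = white
    red_green = r - g                 # positive = reddish, negative = greenish
    yellow_blue = (r + g) / 2.0 - b   # positive = yellowish, negative = bluish
    return black_white, red_green, yellow_blue


# A few pure colors, to see which channel each one moves.
examples = {
    "white": (1.0, 1.0, 1.0),
    "red": (1.0, 0.0, 0.0),
    "green": (0.0, 1.0, 0.0),
    "yellow": (1.0, 1.0, 0.0),
    "blue": (0.0, 0.0, 1.0),
}

for name, rgb in examples.items():
    print(name, tuple(round(c, 3) for c in opponent_channels(*rgb)))
```

An experiment along these lines would run the mapping in the other direction, assigning one data feature to each channel and converting back to a displayable color.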

Thanks to Alan MacEachren I also discovered that Cynthia Brewer, one of the major experts in color use for data visualization (if you don’t know her ColorBrewer, go there NOW), actually tried a similar scheme in the mid-90s. I found an example online (thanks to Robert Roth, who posted it), which I reproduce below.

Brewer (1994: 133) on Twitpic

The map uses a trivariate color mapping to show the “percent of labor force employed in each of the three sectors”. At first sight it might seem like the same idea, just proposed by one of the most authoritative people in the field. But it’s not; it is fundamentally different. In this case, the three dimensions-colors represent the proportions of the same variable (i.e., labor force) across three different categories (services, agriculture, industry). The mapping is in turn fundamentally a categorical mapping, and you still need to refer back to the legend in order to accurately decode the map.
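One way to see what "proportions of the same variable" implies is the sketch below. It is not Brewer's actual color scheme: the function name shares_to_hex, the assignment of sectors to red, green and blue, and the region figures are all made up for illustration. Because the three shares always add up to the whole labor force, any two of them determine the third, which is not true of three independent variables.

```python
def shares_to_hex(services, agriculture, industry):
    """Turn three shares of one total into an RGB hex color.

    Hypothetical mapping: services -> red, agriculture -> green,
    industry -> blue. The shares are normalized so they sum to 1.
    """
    total = services + agriculture + industry
    if total <= 0:
        raise ValueError("shares must sum to a positive total")
    channels = [round(255 * s / total) for s in (services, agriculture, industry)]
    return "#{:02x}{:02x}{:02x}".format(*channels)


# Made-up regions (not real data): percent of labor force per sector.
regions = {
    "Region A": (70, 10, 20),
    "Region B": (30, 50, 20),
    "Region C": (40, 20, 40),
}

for name, shares in regions.items():
    # Two shares fix the third, since the three always sum to 100.
    assert sum(shares) == 100
    print(name, shares, shares_to_hex(*shares))
```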

What is cargo cult visualization and why we don’t need it

Really guys … at the risk of appearing academic or orthodox or whatever: we don’t need cargo cult visualization. But what is cargo cult visualization? The physicist Richard Feynman coined the term cargo cult science in a famous lecture … (source: Wikipedia)

“… to negatively characterize research in the soft sciences (psychology and psychiatry in particular) – arguing that they have the semblance of being scientific, but are missing ‘a kind of scientific integrity, a principle of scientific thought that corresponds to a kind of utter honesty’.”

I immediately thought about it when I came up with the concept for this post. Cargo cult vis is not just chart junk; it’s more insidious. In chart junk there is “only” the bad or creative use of standard charts in ways that basically hide the message behind the glitter. But here we have a bolder step: a method proposed as if it were new when in fact it is not new at all, and it’s badly executed. Cargo cult visualization is trying to invent new techniques without even a minimal knowledge of the basics. That’s dangerous and can deceive novices who are interested in visualization.

If you read this blog regularly you know how much I care about giving honest and solid information to people who want to become data visualization experts. And I am a big, big fan of experimentation and of putting things into practice from day one. But if you are ambitious and want to come up with new techniques, you’d better watch out and do your homework if you don’t want to shoot yourself in the foot. There are some classic knowledge sources in infovis and it doesn’t take much to acquire the basics.

I know I might appear elitist or even arrogant, but self-celebration is not my intent here. I just want to make the point that knowledge is pretty much accessible, and there are no excuses if you are too lazy and pretend to make great things. Just that. If you don’t do your little homework you are just walking in the dark.

… but don’t get me wrong: bad ideas are a fabulous tool!

There is one last thing that I want to clarify because I think it is useful. This post of mine might give the impression that I think that trying out bad ideas is bad. No, no, no! On the contrary, deliberately experimenting with mechanisms you know are clearly wrong is not bad. I learned this lesson several years ago from my dear friend and renowned professor of HCI, Alan Dix, who once taught me the value of creating bad ideas deliberately. I suggest you take a look at his page about bad idea generation and its role in design. The whole thing boils down to the fact that when we create bad ideas deliberately we free ourselves from judgment and criticism and learn new aspects of the problem we are trying to solve. I remember very clearly a couple of great bad ideas: the glass hammer and the inflatable dart board :-) But bad idea generation, in order to be useful, has to be done on purpose. I doubt this is the case for the map we have analyzed here.

I really hope this post was useful to better understand what’s the value of basic knowledge in visualization and to convince people that acquiring it is actually really worth it and necessary.

Of course it is totally possible I missed something or that I am plain wrong on something. If this is the case, please let me know what you think: send me a message on Twitter or comment below. I’d love to hear from you.


  • Luc

    I know nothing about data visualisation and graphic design (I have a photography background), but I came across the map you talk about by chance and my first reaction was “you have to be a graphic designer to use color theory on a map to show data, that’s dumb, nobody knows this sort of thing”. I want to thank you for this article, which explains in detail why this can’t work. I have learned a few things.

  • http://www.visualisingdata.com Andy Kirk

    This type of excellent analysis is why I termed you a ‘grandee’ Enrico!

  • http://jeromecukier.net jerome cukier

    Enrico, I didn’t like this map for the reasons you outlined and for others too. (in short: I understand the question it is trying to answer but this is both the wrong dataset and the wrong representation). But I would disagree with your conclusion.
    we need designers to keep on experimenting even if this ends up like this.
    in an ideal world they would all be experts on vision theory and would have enough skill to know when to bend the rules. failing that (although data designers SHOULD learn vision theory among others!) trying something different is already doing a service to the data visualization community.

    I can be easily convinced that this map was bound to fail. Then again, there are many visualizations that are sticking to tried-and-true, orthodox principles which fail to engage users and to inform.
    my point is, bad ideas can also be productive even when unintentional.

    • Enrico

      Jerome, sorry only now I realized I missed your comment! I am really sorry. I think part of my answer is already contained in my other comments.

      I find it interesting when you say: “bad ideas can also be productive even when unintentional”. I partly agree. It can be productive when it helps people think about a problem in unconventional ways. But this requires a good bunch of educated people who can turn a bad idea into inspiration. I have serious doubts that we can describe the large majority of potential visualization consumers as educated. And this is the gist of my post. Visualization is at a very early stage if we consider its widespread/mainstream adoption, and it is very “fragile”. In 5 or 10 years, or I don’t know how long, we will have the luxury of taking bad ideas and turning them into great ones. At the present moment I have the feeling they can do some harm. Even if it is unintentional.

  • Enrico

    @Andy Oh oh … we are in a kind of love affair then! ;-) Thanks … this kind of input is what I really need to find the energy and time to write these posts. It’s damn hard! Been writing until 2am last night.

    @Luc It’s great to hear you learned something out of it, that’s the goal! We need photography experts here. Come back visiting FILWD if you like.

  • http://moritz.stefaner.eu Moritz Stefaner

    Good stuff, thanks for making the effort and compiling all the resources. I am totally with you, except for the cargo cult paragraph. Why? As far as I can tell, the authors of the infographics never claimed their method was novel or superior. Their goal was just to create a visually strong, captivating infographic. Second, I know from my own experience that in these types of jobs readability or more general “classical infovis fu” is only one of the constraints you are working with. So there might be a chance that the authors were aware of the limitations of the solution they used, but went with it nevertheless. This is a much larger topic, but I am ready to defend the claim that good design is not necessarily only about making things easy, convenient or going for the established path. Sometimes, there are good reasons to bend the rules.

    Anyways, I totally agree that “knowledge is pretty much accessible and there are no excuses if you are too lazy and pretend to make great things. Just that. If you don’t do your little homework you are just walking in the dark.” and I think discussions like these help tremendously in teaching the fundamentals. So thanks again for the write-up!

  • Enrico

    Moritz, thanks a lot for your comment! And for starting this out.

    It’s true, the authors didn’t claim anything about the novelty or effectiveness of the method. Maybe my paragraph was a bit strong (and maybe I bent reality a little bit myself). Nonetheless, when you publish something on the web you have to take responsibility for what you show, especially if lots of people follow you. Even if you don’t say a word about it, your images speak for themselves and send a message. And in this case the message is pretty strong: try out some stuff without thinking about it too much. It’s just fun to throw colors on the screen. I saw the danger of shallowness and wanted to be sure to send a clear message.

    I don’t think the authors had any bad intentions of course, but there are only two options:

    (1) the authors know their design is weak and don’t care, and don’t even mention it – in this case they demonstrate they are not afraid of deceiving people.
    (2) the authors don’t know their design is basically BS and publish it online thinking it is cool – in this case they demonstrate their presumption.

    Which is worse?

    … regarding your sentence “I know from my own experience that in these types of jobs readability or more general “classical infovis fu” is only one of the constraints you are working with” … mmm … if this is a piece of art, fine, I’d love to have it in my living room. But if it is supposed to convey information I cannot see how readability can be bent in any way.

  • http://nvac.pnl.gov Ian Roberts

    Hi Enrico, thanks for writing these great posts. We’ve been passing them around in our internal vis design blog.

    To throw my hat in the Cargo Cult thought, as a user interface designer, I would offer up that perhaps the creators of the visualization were only thinking about how it looks, but not how it functions. I see new interactive visualizations all the time that look really cool, but my worldview is user-focused. What question is this visualization supposed to answer? Can a reasonable person get the information from this presentation better than simply looking at a table of numbers? I think it’s important to have a list of specific user goals and tasks made before the visualization is started; then validate the design by having someone else achieve those goals.

    Thanks again for your blog.

  • http://moritz.stefaner.eu Moritz Stefaner

    Thanks for your comments, Enrico, basically, I am totally with you given the critique of the technique used, but let us be careful not to overgeneralize. I should try and collect my thoughts on the “designerly point of view” on infographics/infovis, and maybe come back with a longer comment or a blog post over at well-formed-data.

  • Enrico

    @Ian this is indeed often the problem in UI design. In visualization it becomes extreme as the artistic side often slips in. Artistic visualization is great, I love it, really! But every form has to follow its purpose. Tables are indeed often better, and I would say that a real test for vis design is to ask yourself: how can I do it without vis? We often do the same exercise in our group when justifying a visual analytics solution. We ask: can we do it automatically? If the answer is yes, visualization is not needed, because it involves at least one human and humans cost a lot.

    @Moritz please do it! I’d love to see your point expressed in a longer blog post. I agree with the overgeneralization problem too. Believe me, I am aware of the issue of being too academic or strict or whatever, but I also know that sending watered-down messages doesn’t work. Better to get some criticism and let everyone learn.

  • Anonymous

    Enrico,

    I find it difficult to accept your criticism of how others are not putting due diligence into their work online, when you yourself don’t put the due diligence into your spelling/grammar. It would be easier to take you seriously if I didn’t have to mentally correct your mistakes each time I stumbled upon one.

    The knowledge of English spelling and grammar is “pretty much accessible” too.

    As for the meat of your posting… I agree with you about how difficult it is for laymen to interpret mixed colours and draw conclusions from the visual representation. I also commend you for highlighting how this idea is not really new and how it breaks certain inherent principles.

    • Enrico

      Dear Anonymous,

      Thanks for pointing this out. It makes sense. Anyway, I want to make sure you know I do my best to write well-polished posts. Every single post takes many hours to create. It’s not something I write just as it comes out of my mind. It’s a lot of work!

      I am surprised you mention spelling errors because I always use a spell-checker while editing my posts. I double-checked this one again after your comment and my spell-checker doesn’t bring anything to light. I’m confused.

      Regarding the grammar, well … I do my best. Really! English is not my mother tongue and of course some errors might slip in from time to time; I am sorry. The alternative is either to close the blog and keep these ideas to myself or to write it in Italian. Both options look sub-optimal for me and for my readers.

      By the way, if you would like to send me a separate email or comment in which you mention which parts of the post caused you the most problems, I will be happy to learn something and change it. And if you would like to do it for the other posts too, that would be fantastic!

