The Data Visualization Beginner’s Toolkit #1: Books and Other Resources

One of the main goals of this blog, other than challenging the status quo with reflections at the intersection between academics and practitioners, is to help people become data visualization experts. It’s not rare for me to receive emails from people who are enthusiastic about visualization but have little guidance about how to become an expert.

I have posted a few articles in the past with this specific goal, but I realized that they are too scattered and not organized well enough to form a coherent resource for readers.

For this reason, I decided to create a series specifically designed to help those of you who are excited about visualization but really don’t know how and where to start. The series is meant to be part of a permanent collection on FILWD and it’s my first serious attempt to react to my own call to action: “When will we decide to provide lots of value?”.

Introducing the series

The Data Visualization Beginner’s Toolkit will function as an orientation guide for people who need guidance in finding the right resources to become data visualization experts. In the guide I will not be teaching visualization directly, nothing technical or theoretical about it (I have plans for this later), but I will show you the resources and one possible path through them.

Having such a guide is particularly important today because data visualization is really just like a jungle. There are plenty of opinions, blog posts, research papers, consumerist visualizations, books, etc., and it’s very hard to separate the wheat from the chaff.

When reading this series please keep in mind that this is my very personal view and, as such, is limited to my own experience. Also, whatever list I propose is certainly neither unique nor exhaustive. If you are looking for an exhaustive list of resources, I highly recommend Andy Kirk’s collection of data visualization resources.

Here is a tentative list of topics I am planning to cover in the series (subject to change):

  1. Books and Other Resources
  2. Programming Languages and Tools
  3. Sources of Good Examples
  4. Research Papers
  5. University Courses

If there is anything else you would like me to cover, please let me know! Send me a message or add a comment below.

Books about Visualization

There is a reason why I start the series with a list of books: if you don’t know the basics of data visualization you will always be an amateur. And what’s worse, visualization experts will notice it and will not take your work seriously.

Also, orienting yourself in the mess we have right now can be discouraging and error-prone. If you type “data visualization” into Amazon the result is a disaster, believe me.

Finally, even if you end up picking up very good books, it is definitely possible they are not the right ones given the amount of knowledge and expertise you currently have. Here I suggest the following path (in order).

Show Me the Numbers: Designing Tables and Graphs to Enlighten (To acquire solid foundations). This book teaches the basics of visualization by using only tables and simple charts. You won’t find fancy and colorful visualizations, only scatter plots, bar charts and stuff like that. But that’s the way to go! If you understand the basics then it’s a lot easier to spot the limitations of basic graphs and go beyond them. Plus the book contains the best summary of visual perception applied to visualization I know. It really is a true gem. Don’t make the mistake of being attracted by fancy stuff and skipping the basics: start here and you will have very solid foundations.

Readings in Information Visualization: Using Vision to Think (Chapter 1 only) (To go beyond simple charts). Once you understand how charts work and you have learned the basics of visual perception, you are ready to explore fancier stuff. Yet you need some guidance on how to explore the huge data visualization space. The first chapter of this book is the best self-contained piece of work I know. It provides all that’s needed to start thinking more creatively, but in a structured manner, about advanced visualizations. The book also has a strong emphasis on interaction, which is important. If you want to go beyond the first chapter, fine, but the book itself is a collection of papers and many of them are totally outdated. But wait a moment, this doesn’t mean you cannot find useful material there! In the collection you can find fundamental papers that are totally worth a read: the work of Jacques Bertin above all.

The Visual Display of Quantitative Information and the rest of Tufte’s books (To learn what “graphical excellence” is). People go crazy for Tufte’s books and I understand why: they are totally beautiful, the cover, the format, the colors, the content, everything. But regardless of their beauty, I have always thought it’s really hard to learn something out of them; they require you to think really deeply about what you see. Basically they are “just” a collection of images. The Visual Display of Quantitative Information is the first one and is the only one I truly recommend because it gives more guidance than the others. The others are wonderful but you will have to spend more time on them to translate their content into design practices.

Information Visualization: Perception for Design (To know what happens in our brain when we see a visualization). If you have read all the books cited above, congratulations! You have learned a great deal. Now, information visualization is deeply rooted in visual perception and cognition. If you want to master the art of visualization, at some point you will have to know these basics, especially if you aspire to design innovative visualizations that fit people’s needs. This book starts from the very basics of human vision (e.g., how the eyes work) and goes up to how we think with visualizations. It’s a tough read but it’s totally worth it. You will have to spend quite some time thinking about how these theories apply to your specific projects, but believe me, it’s a true investment. I have experienced countless situations where a visualization design problem was deeply rooted in one of the issues discussed in this book. You will find yourself referring back to it all the time.

More Books about Visualization

Important: Are the books not mentioned above bad or not worth it? Absolutely not.

It is important to consider two factors: (1) there are several books I have never read or even skimmed through which might provide some additional value to you; (2) there are extremely valuable books I’ve not included just because they are either too advanced or don’t fit the progression of readings I am proposing here. Please keep in mind: I suggest you read these books in the order I gave above.

A few additional books that come to mind and deserve at least a short mention are:

  • Any other book written by Stephen Few.
  • Any other book written by Edward Tufte.
  • The statistics-flavored and super-classic Visualizing Data and The Elements of Graphing Data by William Cleveland.
  • The monumental Semiology of Graphics by Jacques Bertin, which I did not include because it is still hard to get, even though a new edition has come out, and because it’s a really hard read for non-experts.
  • The extremely beautiful and information rich Visual Language for Designers by Connie Malamed, which I did not include because I haven’t finished reading it yet.
  • The deep and dense How Maps Work by Alan MacEachren, which despite the title teaches visualization and makes you think deeply about it.
  • The little-known gem Designing Visual Interfaces by Kevin Mullet and Darrell Sano, which teaches aesthetics in a functional and systematic manner.

Books NOT about Visualization

It’s important to acknowledge that not all the knowledge a visualization expert needs comes from data visualization books. I have no intention of writing another long list of books from related disciplines, but it’s important for you to know that a good data visualization expert may have strong foundations in areas such as: statistics and data mining, data management and manipulation, human-computer interaction and cognitive science.

I don’t want to scare you: you can start doing visualization without these, but little by little you likely will find yourself digging more into these areas.

Also, let me stress the importance of human-computer interaction and related areas. While the rest is normally acquired, at least on a superficial level, by using the various technologies you encounter along the way, human-computer interaction has a less technical flavor and you might not learn anything about it unless you seek it out.

Knowing how people reason and interact with user interfaces is a crucial skill, the real differentiator, that you’d better acquire if you want to become a pro. I cannot stress this point enough. Visualization, like any other user interface, happens in people’s minds, not in the computer! And if you want to design great ones you’d better learn how people’s minds work.

There is only one book I feel like suggesting as a starting point: the brilliant, super-practical, and freely-available Task-Centered User Interface Design.

Other Learning Resources

Unfortunately, other than the books I mentioned above, there are not many other sources from which you can really learn something. But luckily there are a few notable exceptions! Tamara Munzner and Jeff Heer, top researchers in the field, share the material of their courses freely on the web and you should not miss them for any reason:

1) Tamara Munzner’s InfoVis Course Slides at University of British Columbia
2) Jeff Heer’s InfoVis Course Slides at Stanford University

These are university courses, with a specific target, but I cannot think of a more carefully and better organized set of information covering the whole theory and practice of information visualization. What is really unique in these courses and their material is the way this information is organized. Information visualization is still a young discipline and nobody really agrees yet on the content and order to use when teaching it. In my opinion, these two courses have found the perfect balance between coverage and organization.

Another great source for learning data visualization is Stephen Few’s collection of articles and white papers, which teach a whole lot of fundamental data visualization skills in his usual concise and effective style.

Can I get all the knowledge I need with these books? No.

And there are two main reasons. First of all, one of the biggest and most surprising gaps I see in the current literature is the lack of a book that systematically teaches how to design a visualization from scratch. I am really surprised. Ben Fry in his Visualizing Data has a few elements of it, but since the book also teaches how to use Processing the whole thing is a bit too diluted. Apart from that, I am not aware of any book that fills this gap (please let me know in case you know one).

A second issue is that there is no book that can really teach you to be a great data visualization designer. The only way to become an expert is to actually design your own stuff and iterate over and over on it until you perfect your skills. Studying and reflecting are important, but doing is equally, if not more, important. The two things complement and enrich each other.

Conclusion

That’s all folks. I really hope this series will be useful to you. Let me know what you think and if it helps. Also, I’d really love it if you could enrich it with your suggestions. You can write comments below or send me a message on Twitter at @FILWD.

Please do not forget to share this with your friends or people who might benefit from it. Its main purpose is to help you guys become better data visualization experts. Help me spread the word.

Take care, have fun,
Enrico.