Infoviz on top of stat graphic on top of spreadsheet

Kaiser points to this infoviz from MIT’s Technology Review:

[Image: the infoviz from Technology Review]

Kaiser writes:

What makes the designer want to tilt the reader’s head?

This chart is unreadable. It also fails the self-sufficiency test. All 13 data points are printed onto the chart. You really don’t need the axis, and the gridlines.

A further design flaw is the use of signposts. Our eyes are drawn to the hexagons containing the brand icons but the data is at the other end of the signpost, where it is planted on the surface!

Here is a sketch of something not as cute:

[Image: Kaiser’s sketch]

I [Kaiser] expressed time as years . . . The mobile-related entities are labelled red. The dots could be replaced by the hexagonal brand icons.

I agree with all of Kaiser’s criticisms, and I agree that his graph is, from the statistical perspective, a zillion times better than what was published. On the other hand, unusual images can get attention. Recall the famous/notorious clock plot from Florence Nightingale. This is why I’ve moved to the idea of accepting both styles. Maybe Technology Review could feature their arty graph, but then when the reader clicks on it, they go straight to a statistical graph. And then another click could go to a spreadsheet with the raw data (and as much metadata as needed).

I like the lines in Kaiser’s graph. Strictly speaking, the points convey all the information, but the lines indicate growth, which is what it’s all about.

P.S. On the details, I think Kaiser’s graph could be better. I’d like to see labels on all the points; one option would be to put the y-axis on a log scale so it will all be more readable. Also, then the straight lines on log scale will correspond to exponential growth, which might be more realistic than linear for most of the data. He could also play around with having the x-axis be absolute time rather than relative time, so that we could see when each platform started. (Recall also one of our core principles of graphics, which is that there’s no need to limit ourselves to a single display.)
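To spell out the log-scale point: if a quantity grows as y = a(1 + r)^t, then log y = log a + t log(1 + r), which is a straight line in t. Here is a minimal sketch in Python. The numbers are made up for illustration and are not taken from the Technology Review chart, and the function names are hypothetical.

```python
import math

def exponential_series(start, rate, years):
    """Values that grow by a constant factor (1 + rate) each year."""
    return [start * (1 + rate) ** t for t in range(years + 1)]

def log10_slopes(values):
    """Year-over-year differences of log10(values)."""
    logs = [math.log10(v) for v in values]
    return [b - a for a, b in zip(logs, logs[1:])]

# Hypothetical series: starts at 1.0 and grows 50% per year for 5 years.
# These are illustrative numbers, not data from the original graphic.
users = exponential_series(start=1.0, rate=0.5, years=5)

# On the raw scale the yearly increments keep getting bigger (a curve);
# on the log scale every yearly step is the same size (a straight line).
slopes = log10_slopes(users)
```

Each entry of `slopes` equals log10(1.5) up to floating-point error, so on a log-scale y-axis this series plots as a straight line, while the raw values curve upward.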

13 thoughts on “Infoviz on top of stat graphic on top of spreadsheet”

  1. > … then the straight lines on log scale will correspond to exponential growth, which might be more realistic than linear for most of the data.

    Drawing exponentials through a single data point, eh? Have you considered a second career in marketing? ;-) More seriously, I concur with your log scale, labels on points, and absolute time suggestions. I’d just do it up as a scatterplot though, with no lines or curves except possibly for the PC data.

  2. There has to be a way to convey graphically that these values are peaks, not just stations passed through on the way to further heights. Straight lines (not to say exponential lines) are misleading in that respect. Also, starting each line on the product’s launch date may turn this graph into a vermicelli mess.

  3. I’d also label the end point with the year underneath the label; that would add another potentially interesting dimension to the data.

  4. Rather than the lines in Kaiser’s graph, why not use real data series over the intervening years? Or is that data not available?

  5. I think there’s no point in debating straight lines, scatterplots, time dimensions etc. unless we know what the intent of this visualization is. Those are all great ideas, but what is the relevant comparison we want to focus on? If the intent is to tell a clear story, one would probably need to produce a few graphs each one incorporating those suggestions. The original infoviz is horrendously memorable, yet amazingly incomprehensible.

    • > I think there’s no point in debating straight lines, scatterplots, time dimensions etc. unless we know what the intent of this visualization is.

      I disagree. You can rule in and rule out some approaches based on general good/bad practice. You can formulate a subjective prior for attributes of an informative graph. You don’t drink red wine with fish. You don’t draw a curve through a single data point. (If you’ve got an a priori model for the data then fine, draw the curve, but don’t show a best-fit curve if the uncertainty associated with the best fit renders the result meaningless.) You can apply general rules without knowing specific intent.

      • Chris, I was taking the “good practices” for granted here. Good practices as in Tufte or Tukey. What I was questioning is the intent of the visualization, and without knowing that it’s hard to go beyond some general advice. Is it growth? Relative comparisons? Changes over absolute time? For each of these I can see a different way to focus on the comparison.

  6. Without endorsing the claims of “The Singularity Is Near” by Ray Kurzweil, Chapter 1 has many useful technology history curves, with both linear and log scales. The challenge with exponential growth curves is that they are often really S-curves, and the inflections are sometimes not obvious except in the rear-view mirror. Put another way: Moore’s Law had a great run for CMOS, and it’s not done, but it’s getting much harder and more expensive, and the easy clock-rate performance boosts from transistor shrinks are long gone.

  7. A further problem with the original graphic, if I understand it correctly (without reading the article), is that the different pin lengths associated with the software platforms convey no factual information, which is to say they convey misinformation. They make visual organization look like it’s representing a third dimension.
