Displaying Financial Data, redux

A few weeks ago, I posted an entry about a bad graphical display of financial data; specifically, which asset classes have performed well, or badly, by year. Here’s the graphic:

I pointed out that although this graphic is poor, it’s not easy to display the same information really well, either. For instance, a simple line plot does a far better job than the original graphic of showing the extent to which asset classes do or don’t vary together, and which ones have wilder swings from year to year, but it’s also pretty confusing to read. Here’s what I mean:

I suggested that others might take a shot at this, and a few people did. Kelly O’Day sent
which is good for comparing variability of different classes, but bad for seeing which classes do or don’t vary together in time. Kelly also sent

Hadley Wickham sent this contribution:

(Hadley provides the R code, too. I feel that I should note that this R code is both more elegant and more general than what I woulda done.) The lower plot breaks the asset classes into groups based on variance, which is nice. As with my graphic, though, the heavily overlapping lines and sometimes similar colors make it hard to see exactly what is going on with each asset class.

Richard uses a tabular approach, where colors indicate yearly performance:

I would say that each of these has some good and some bad characteristics (even the original one at the top). It’s very hard to make a single display that lets you see both the relative and absolute performances, for each year and for the whole period. The original graphic gives up on the absolute performances (or at least gives up on graphically displaying them; you can still read off the percent gains), in favor of simply rank-ordering within each year and overall. My contribution, and the uppermost of Hadley’s plots, put everything on a single line plot; you can see how things vary together, you can see the relative volatility (i.e. variance) of the various asset classes…but this is a lot of lines on a single plot, and is therefore hard to read. (Hadley’s color scheme is harder for me to distinguish than “my” color scheme, which was an attempt to duplicate the one in the original chart.) Kelly’s two contributions attempt to resolve the overlapping-lines issue by presenting the data two ways: side-by-side, which allows visually comparing variances but does not help with comparing temporal behavior; and vertically aligned, which allows comparison of temporal variability but makes it harder to compare variances. Richard’s table is easy to follow, but (for me) much of the interpretation comes from reading the numbers rather than taking advantage of our ability to process graphical information.
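To put rough numbers on "vary together" and "relative volatility", here is a minimal sketch in Python (it is not the R code Hadley provided). It computes the variance of each class and the correlation between each pair of classes. The class names and yearly returns are made-up placeholders, not the values from the chart, and the `correlation` helper is just an illustration.

```python
from math import sqrt
from statistics import mean, pvariance

# Hypothetical placeholder returns (percent per year).
# These are NOT the figures from the original chart.
returns = {
    "Class A": [10.0, -5.0, 20.0, 0.0],
    "Class B": [8.0, -3.0, 15.0, 1.0],
    "Class C": [-2.0, 6.0, -4.0, 5.0],
}


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Volatility: population variance of each class's yearly returns.
variances = {name: pvariance(vals) for name, vals in returns.items()}

# Co-movement: correlation for every unordered pair of classes.
names = sorted(returns)
correlations = {
    (a, b): correlation(returns[a], returns[b])
    for i, a in enumerate(names)
    for b in names[i + 1:]
}

for name in names:
    print(f"{name}: variance = {variances[name]:.2f}")
for (a, b), r in correlations.items():
    print(f"{a} vs {b}: correlation = {r:+.2f}")
```

A correlation near +1 means two classes tend to rise and fall in the same years, and a value near -1 means they tend to move in opposite directions. These summaries complement the plots rather than replace them, since they say nothing about which particular years were good or bad.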



  1. Hadley says:

    I agree with you about my colour choices – the problem is that it is very hard to come up with an algorithm for making nice colour choices, especially when the number of colours is so high. I am choosing evenly spaced colours from a "perceptually-uniform" colour space, but some colours look very similar to me.

    A more general solution, of course, is to use interactive graphics where brushing would considerably aid discrimination of different series.

  2. K says:

    Why use color at all? Two authors I greatly respect, Stephen Wolfram (see his book, A New Kind of Science) and Edward Tufte (all of his books are thoroughly enjoyable), both recommend using monotone (black) instead of color. Whether monotone can be successfully used for signed values is another matter; it seems odd to use a gray color for zero and successively lighter shades for negative values. I think it may be helpful to separate the rows in Richard's table to highlight the temporal progression of the data; I find it hard to 'see' variance over time as it is.

    Your final statement is interesting in light of Tufte's views; reading the raw numbers to interpret the data should be seen as a sign that the graphic presents the data accurately without obscuring its complexity.

  3. Hadley says:

    Using black and white in a book is often motivated by cost. Using 12 shades of grey would be even harder to distinguish in this example. Of the other visual attributes you could apply to a line (dashing, blur, alpha, size, …) I don't think any would do as well as colour. Combining multiple visual attributes would be another idea, but that brings costs as well (e.g. no longer pre-attentive).

    Tufte is an excellent resource when it comes to presentation graphics. However, he gives no guidance for exploratory graphics, which is where the strengths of interactive graphics lie. Generally, retrieving values from a graphic is of little interest (provided the graphic is honest).

  4. Anonymous says:

    We're trying to find a single graphic (or maybe two) that makes it easy to see several different things, such as: (1) how has any single asset class performed over time, (2) which asset classes do or don't have similar year-by-year behavior, (3) which asset classes are most and least variable, (4) which asset classes (if any) have usually been better or worse than others.
    I think that for the amount of data being displayed, there is no great solution — no single graphic is going to do everything really well.

    Of course you can work with tabulated data and determine these, with some effort…but as with all graphics, the idea is to not have to do the work: take advantage of our unconscious visual-processing systems to convey the message. To me, the graphs that come closest to doing this are the line plots that show all of the curves superimposed. Yes, they're kinda messy, but they come closest to showing everything. And as Hadley points out, they definitely need color—multiple line styles won't do the job. (For what it's worth, Tufte doesn't recommend doing everything monotone, either).
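Hadley's comment above about choosing evenly spaced colours from a perceptually uniform colour space can be sketched roughly as follows. This is a minimal Python illustration, not Hadley's R code. It assumes CIELAB in its polar LCh form as the colour space (the comment does not say which space was used), and the function names and the lightness and chroma defaults are arbitrary choices made for the example.

```python
import math


def lch_to_srgb(lightness, chroma, hue_deg):
    """Convert a CIELAB LCh colour to an sRGB triple clamped to [0, 1]."""
    # Polar LCh -> rectangular Lab.
    a = chroma * math.cos(math.radians(hue_deg))
    b = chroma * math.sin(math.radians(hue_deg))

    # Lab -> XYZ, using the D65 reference white.
    fy = (lightness + 16.0) / 116.0
    fx = fy + a / 500.0
    fz = fy - b / 200.0

    def f_inv(t):
        delta = 6.0 / 29.0
        return t ** 3 if t > delta else 3.0 * delta ** 2 * (t - 4.0 / 29.0)

    x = 0.95047 * f_inv(fx)
    y = 1.00000 * f_inv(fy)
    z = 1.08883 * f_inv(fz)

    # XYZ -> linear sRGB.
    linear = (
        3.2406 * x - 1.5372 * y - 0.4986 * z,
        -0.9689 * x + 1.8758 * y + 0.0415 * z,
        0.0557 * x - 0.2040 * y + 1.0570 * z,
    )

    def gamma(c):
        c = min(max(c, 0.0), 1.0)  # clamp colours that fall outside the sRGB gamut
        return 12.92 * c if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055

    return tuple(gamma(c) for c in linear)


def evenly_spaced_palette(n, lightness=65.0, chroma=50.0):
    """Return n colours at equal hue steps with fixed lightness and chroma."""
    return [lch_to_srgb(lightness, chroma, 360.0 * i / n) for i in range(n)]


palette = evenly_spaced_palette(12)
for r, g, b in palette:
    print(f"#{round(r * 255):02x}{round(g * 255):02x}{round(b * 255):02x}")
```

With 12 colours the hues are only 30 degrees apart, which fits Hadley's observation that some neighbours still look very similar. Clamping out-of-gamut values also bends the even spacing a little.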