Making better data charts: From communication goals to graphics design

JoElla Carman at the Urban Institute blog writes:

Like a growing number of organizations, the Urban Institute maintains a style guide for data graphics. This guide specifies almost a dozen elements that should appear on charts, like axis labels and source lines. . . . [But] the guide says a lot about how and what the chart creator should do but very little about why. And there’s nothing like understanding why something is being done to align a team around the hows and whats.

With that in mind, I [Carman] would like to turn the style guide inside out, starting with our guiding intentions and connecting them to the specified elements of a chart.

I like this advice, and I’ll excerpt it below. I also think there are places where the advice could be even better, and I’ll share that too.

So here goes. All words are Carman’s, except for my additions in italics inside brackets.

1. A message

Focus this with selected data.

A story isn’t everything that happened. Likewise, a good chart isn’t all the data that ever existed. It’s an honest — not cherry-picked — selection of data put in a frame to communicate a point. . . .

[My suggestions:

a. Rather than using a legend at the top of the graph, just label the lines directly. This will convey the message without the reader needing to go back and forth.

b. Improve the x-axis by putting year labels every ten years and tick marks every five years. Again, this allows you to make your point without the reader needing to work to figure out what is happening when.]
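Both suggestions are easy to carry out in most plotting libraries. Here’s a minimal matplotlib sketch with made-up data (the series names and numbers are invented, purely for illustration): each line gets a direct label at its right end in the line’s own color, and the x-axis gets labeled ticks every ten years with unlabeled minor ticks every five.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed in a notebook
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator

# Hypothetical data: two made-up trends over forty years
years = np.arange(1980, 2021)
series = {
    "Group A": 50 + 0.3 * (years - 1980),
    "Group B": 60 - 0.2 * (years - 1980),
}

fig, ax = plt.subplots(figsize=(7, 4))
for name, values in series.items():
    (line,) = ax.plot(years, values)
    # Suggestion (a): label each line directly at its right end,
    # in the line's own color, instead of using a legend.
    ax.annotate(name, xy=(years[-1], values[-1]),
                xytext=(5, 0), textcoords="offset points",
                va="center", color=line.get_color())

# Suggestion (b): year labels every ten years, tick marks every five.
ax.xaxis.set_major_locator(MultipleLocator(10))
ax.xaxis.set_minor_locator(MultipleLocator(5))
ax.set_xlim(1980, 2027)  # leave room on the right for the labels
```

The point of the `offset points` annotation is that the label travels with the end of the line, so the reader never has to look away from the data to decode it.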

2. Purpose

Tell the reader where you’re headed with a title.

Like most communication lengthier than a tweet, a chart has a title to focus the reader and help her decide if she’s interested in reading on.

As this list item’s name implies, there are a few ways to think about the role of the title.

A very newsy way to write a title is to “say what you see” and describe what’s on the graph, like “Half of respondents don’t like peas” or “Peas fail to please”. These are sometimes paired with a secondary statement (set in less prominent type) that unpacks the title, sharing details for those who have bought in to reading the chart and want to learn more — “A survey by XYZ Polling Service found 51.6 percent of people either disliked or strongly disliked the taste of peas.”

Another way to summarize the same graph would be, “Share of survey respondents indicating dislike of peas.” . . .

[I agree 100% that a direct title is best. I don’t always do this myself but I probably should.]

3. Context

Put the issue in perspective with axis lines.

You will rarely see a chart with just one data point on it. Charts show relationships: one thing to itself over time, or different categories of things compared with one another. These relationships, usually shown vertically and horizontally, form the grid under the plot.

The axis should be clearly labeled, and the categories on it should be well thought out. Too many, and the story is incomprehensible; too few, and it’s hard to know whether the individual values fit into a trend or break it. . . .

[I can’t figure out what’s going on in this graph! Some of these seem to be labeling groups of people by ethnicity (“American Indian,” etc.) or economic status (“Economically disadvantaged”), but other lines seem to correspond to variables (“Disability status,” which I guess would be Yes or No, or maybe it’s on a five-point scale or something?). It’s also not clear why the graph includes all these different ethnic categories but only one economic category. And I can’t figure out how these categories are ordered. I’m glad it’s not alphabetical but I don’t know what it is. What’s happening is I’m spending a lot of time trying to figure out how to read the graph, which is getting in the way of its intended message.]

4. Credibility (and courtesy)

Do this with a source line.

Showing where data came from helps readers judge the validity of the information and provides a path for further exploration.

A good chart provides enough of a trail that the reader could find the original information on her own.

[Yes yes yes yes yes. Yes. I think this is an important general point.]

5. Transparency/limitations

Do this with notes.

As much as we try, not everything that needs to be known about a chart goes on the chart itself. Sometimes, we still need something like footnotes. Here we can define terms, spell out acronyms, give margins of error (if not shown on the chart itself), explain what data might be missing from a series, or present other limitations, including how data were collected.

A good chart isn’t trying to obscure anything. Be sure to leave a help line for any confusion you think might arise.

[Yes again. As the saying goes, a picture plus 1000 words is better than two pictures or 2000 words. And indeed this point is demonstrated by Carman’s post that we’re writing about: neither words nor pictures alone would make her points.]

6. Clarity

Give your readers confidence about what they’re looking at with labels and/or a legend.

Charts are about visual communication, but that doesn’t mean the visuals should be asked to do all the work. Text has an important role, nudging the viewer toward your intended meaning.

Anything on the chart that’s encoded should be explained: the meaning of colors, heights, widths, and what’s on the axis, to name a few.

Bonus: Placing a label close to the thing it references (direct labeling), mirroring the order of categories from the chart in the legend, or mimicking their appearance are all good ways to strengthen connection between a label and its object. . . .

[Indeed; see my comment on item 1 above, which demonstrates the meta-point that good advice can be hard to follow. We’re busy people and we often don’t follow our own good advice.]

7. Identity

Do this with a credit/byline or organization’s logo and branding.

If a chart really were just data, you would only need a source. But the choices you make in creating a chart give it a frame, so own it! . . .

[That’s a good point. I don’t usually do this, but I see how it can make sense, especially on the internet where graphs get copied and sent around without their original context.]

8. A pathway

In a crowded landscape, help the reader get to the point with visual hierarchy.

Make the chart easy to read by directing the reader’s attention with size, color, and placement.

Text that’s critically important should be big and bold; helper text should be small and light.

Prevent background elements from competing with the main event by turning them down, e.g. mute grid lines and contextual categories with grays or subtle colors.
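As a small illustration of that “turning down” (my own sketch, with invented numbers and labels), here is how the de-emphasis might look in matplotlib: the gridlines are pushed behind the data and muted to light gray, the top and right frame lines are removed, and the helper text is smaller and lighter than the bold title.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen; not needed in a notebook
import matplotlib.pyplot as plt

fig, ax = plt.subplots(figsize=(6, 4))

# Main event: the data, drawn in a strong color and weight
ax.plot([2017, 2018, 2019, 2020], [3.0, 3.2, 3.4, 2.1],
        color="steelblue", linewidth=2.5)

# Background elements turned down: light gray grid drawn behind the
# data, and no top/right frame lines competing for attention
ax.grid(True, axis="y", color="0.85", linewidth=0.8)
ax.set_axisbelow(True)
for side in ("top", "right"):
    ax.spines[side].set_visible(False)

# Visual hierarchy in the text: big bold title, small light helper text
ax.set_title("Enrollment dipped in 2020", loc="left",
             fontsize=13, fontweight="bold")
ax.text(0, -0.15, "Hypothetical numbers, for illustration only",
        transform=ax.transAxes, fontsize=8, color="0.5")
```

None of these choices change the data being shown; they just control where the reader’s eye lands first.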

[Yup. This comes up a lot. It would be good to have an example here. OK, I’ll give my own, from this recent post. Here’s the graph:


In this example, my primary goal—really, my only goal—was to amuse, and part of the gimmick was to include jokes while staying within a legitimate statistical graphics framework. So I ordered the franchises in time order and I kept the histograms in gray. The most prominent aspect of the graph is the franchise names, and that was the point, because I wanted the reader to have the fun of recognition: “Yuma . . . 3 . . . what’s that . . . oh yeah, 3:10 to Yuma,” “Hawaii . . . 5 . . . Hawaii 5-0, ha ha,” and so forth. My point here, echoing Carman in her post, is that the decision of how to put the graph together—down to the details—depends on your communication goals, that is, on why you decided to make and post this graph in the first place. Yes, there are some general principles and some advice that’s often useful (ordering of variables, if zero is in the neighborhood invite it in, put the labels next to the lines, etc.), but ultimately you should be thinking about your immediate communication goals.]

9. Your work

Let the reader in on the thinking behind the chart by using detail in the units.

The numbers on a chart are interesting because they measure a specific thing. Whether it’s straightforward like “population (in millions)”, a little more layered like “density (people per square mile)”, or a lot more layered like “potential users (people within 10 miles without a car)”, well written unit labels will help readers understand what they’re seeing.

[So true.]

10. Nothing at all

Finally, sometimes the best chart is no chart!

A data visualization can take place in the imagination, as with a comparison like “the airplane is as large as a football field” or the insight from some data might be easily summarized with a multiple (“twice as many adults 65 and older lost jobs as any other age category”).

[I guess this is related to the idea that a talk can go better without any slides. Engage the audience’s visual imagination and you’ve extracted a greater cognitive commitment from them.]

One other thing. The above discussion is all about presentation graphics. But the same principles hold for exploratory graphs that you’re making for a research group, or just for yourself. Remember the saying, “Your most important collaborator is you, six months ago—and she doesn’t answer email.” Even if your own goal in a graph is to explore and enhance your own understanding, this is still a communication goal and it should inform your design choices.

And this relates to another point, which is the value of flexible software that allows you to express design choices, so that you can make effective visualizations, for yourself and others, while not breaking your time budget.

14 thoughts on “Making better data charts: From communication goals to graphics design”

  1. The software point is important. Andrew is always telling people to put labels on the lines, not in a legend, but ggplot’s defaults are to include a legend. So I always just ignore those comments because it’s too much of a pain to figure out how to change the defaults. Also, I don’t seem to have as much trouble associating colors/shapes, etc. with lines as Andrew does, though I do agree that in most of these cases, including the first example here, labels on the lines would be easier to read.

    I think the graph you were having trouble interpreting is all indicator variables and it’s just telling you the difference between “adjusted” and “unadjusted” estimates for change in kindergarten enrollment. I know that what you want is an interactive tool that lets you visualize an MRP estimate of all this. I have no idea why they’re sorted this way.

    P.S. For once and for all, I’m not a robot. I’m not a robot. I’m not a robot. At least it’s not making me label stoplights and pedestrians.

  2. One semi-related thing I want to bring up is possible solutions for visual-impairment accessibility. Social media sites like Twitter (I believe) allow users to provide small qualitative descriptions of photos they upload so that the visually impaired can “view” a photograph descriptively.

    I was thinking it might be a good idea to incorporate such a thing into scientific articles for data visualisations. Not only does it help the visually impaired, but writing a text description of a graph might be a way of encouraging authors to consider more carefully how they’re displaying the graph to begin with.

    One downside is the obvious dent to the time budget; a clear, informative description is often hard to write even if it’s short. But it might be something worth considering, if it hasn’t been brought up already.

  3. Regarding #3, I think in this case the labels are too close to the data. At least putting them in a column to the left or right would allow you to scan them and decide which ones you cared about.

    Another odd thing about this chart is the ordering–the slices are ordered based on the size of the adjustment. The article this comes from (https://www.urban.org/urban-wire/better-data-use-shows-depths-pandemic-prekindergarten-crisis) makes the point that adjusting the data is important, fine, but the accompanying text still talks about the adjusted numbers, not the degree to which the numbers were adjusted. (In other words, this is still an article about pre-K enrollment, not a case study about which groups were most affected by the adjustment.) The unadjusted numbers are there for context (presumably because they correspond to the numbers in the chart above it) but the point of the article is that they’re not very meaningful.

    It would make more sense to order these by the adjusted numbers, perhaps within groups such as ethnicity. Presentation of those categories in a way that reinforces visually what it is about this collection of points is meant to be important is another way to keep the story from being incomprehensible.

    I’m not a robot. Thanks for asking!

    • Jeff:

      Good points on chart #3. One way to think about this more systematically is to think about the goals of the chart. All graphs are comparisons, so what is being compared? If you want to compare different ethnic groups, for example, it makes sense to put them together, or to use the same color for them, or maybe both.

  4. Super minor point, but might help clarify one thing about #3. In regards to your comment “… And I can’t figure out how these categories are ordered…” I think they’re ordered by (unadjusted-adjusted). It seems like the top row is the group with the largest “positive” gap between those two measures, “economically disadvantaged” is the group with the smallest “positive” value for that measure, Black is the group with the “smallest negative” value (or the first where adjusted > unadjusted), etc.

    Not saying it’s a good metric one way or the other, just helping clarify a small point!

  5. I’ve spotted a few weird data charts in a Nature paper (“Efficient and targeted COVID-19 border testing via reinforcement learning,” here https://www.nature.com/articles/s41586-021-04014-z) and would appreciate valuable feedback from Andrew and the blog members.

    So the algorithm in the paper supposedly allocates covid tests to the riskiest passengers. But notice Figure 9 of the supplement. The algorithm does not seem to discriminate well between destinations: the credible intervals (which they call confidence intervals, but that is another matter) mostly coincide, and note also the wider CIs at higher-prevalence locations. How could someone possibly use that input to allocate tests? And if someone does, couldn’t such testing and exploratory behavior be inferior to random testing?

    Also notice Fig. 3 of the main paper. It seems the less you test, the more benefit you get from using the algorithm (over 4 times more when you do no testing at all!). But wouldn’t we expect a concave form for the benefits of any algorithm that works that way (no matter how efficient)? I mean, when the available tests are too few to get any sensible estimate of prevalence (hence when you are ignorant), why would anything work better than random testing, and why shouldn’t you first focus on getting good estimates? Why would very noisy prevalence estimates (based on too little data) be a good guide to sample selection, and work any better than totally random selection (especially when the data are non-stationary, as the authors accept)?

    Thanks!

    • An important feature is that the Eva algorithm “strategically allocates some tests to traveler types for which it does not currently have precise estimates in order to better learn their prevalence (exploration).”
      If you compare supplement figure 11 to figure 9, you’ll see that the traveler groups with the wide error bars were fully tested, and that the remainder of the test capacity was preferentially allocated to traveler groups with higher risk.
      This is possible because wider error bars correspond to smaller groups; a wide error bar basically means “test everyone here”.

      In effect, the algorithm always fully tests small groups, and then “decides” which large groups to allocate more tests to.
      Allocating fewer tests to large low-risk groups makes the algorithm perform better than random testing.
      (Eva would perform worse than random sampling if all of the large groups were high-risk and all of the small groups were low-risk, but apparently that doesn’t tend to occur.)
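      To make that logic concrete, here is a toy sketch in Python. This is an illustration of the description above, not the actual Eva algorithm; the group sizes, prevalence estimates, and the threshold for “small” are all invented.

```python
# Toy sketch of the allocation logic described above -- NOT the actual
# Eva algorithm. Group sizes, prevalence estimates, and the threshold
# for "small" are all invented for illustration.
def allocate_tests(groups, capacity, small_threshold=50):
    """groups: list of (name, size, estimated_prevalence) tuples."""
    alloc = {}
    # Exploration: fully test every small group, which is also what
    # produces the (wide-error-bar) prevalence estimates for them.
    for name, size, _prev in groups:
        if size <= small_threshold:
            alloc[name] = size
            capacity -= size
    # Exploitation: remaining capacity goes to the large groups with
    # the highest estimated prevalence first.
    large = sorted((g for g in groups if g[1] > small_threshold),
                   key=lambda g: g[2], reverse=True)
    for name, size, _prev in large:
        take = min(size, max(capacity, 0))
        alloc[name] = take
        capacity -= take
    return alloc

groups = [("A", 40, 0.01), ("B", 500, 0.08), ("C", 800, 0.02)]
print(allocate_tests(groups, capacity=300))
# -> {'A': 40, 'B': 260, 'C': 0}: the small group is fully tested,
#    and all leftover tests go to the riskier large group.
```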

      Also notice Fig. 3 of the main paper. It seems the less you test, the more benefit you get from using the algorithm (over 4 times more when you do no testing at all!). But wouldn’t we expect a concave form for the benefits of any algorithm that works that way (no matter how efficient)? I mean, when the available tests are too few to get any sensible estimate of prevalence (hence when you are ignorant), why would anything work better than random testing, and why shouldn’t you first focus on getting good estimates?
      You’re misstating what happens. These figures are relative: a smaller percentage of travelers tested is not caused by fewer tests being done but by more travelers arriving.
      As I have quoted above, the algorithm does “first focus on getting good estimates” for each group. Happily, the sample size for a good estimate doesn’t depend (much) on the group size: whether you sample 50 of 2000 or 50 of 5000 makes little difference for the quality of your prevalence estimate.

      The efficiency of the algorithm over random testing comes from it directing tests away from large groups of low-risk travelers toward large groups of high-risk travelers, and that works better the more “large” groups it has to work on and the scarcer the resources are. (If you could test everyone, there would be no benefit, but all groups would be “small.”) Therefore, the algorithm becomes more efficient relative to random sampling the more passengers there are to be tested, and with a constant number of tests, that means it works best when relative testing capacity is low.

      I don’t think that’s weird.

      • It looks like I messed up the quote formatting; the quote was supposed to end after the first paragraph, and “You’re misstating…” is the start of my response.
