Alberto Cairo’s visualization course

Alberto Cairo writes:

Every semester I [Cairo] teach my regular introduction to information design and data visualization class. Most students are data scientists, statisticians, engineers, interaction designers, plus a few communication and journalism majors.

At the beginning of the semester, many students are wary about their lack of visual design and narrative skills, and they are often surprised at how fast they can improve if they are willing to engage in intense practice and constant feedback. I’m not exaggerating when writing “intense”: an anonymous former student perfectly described the experience of taking my class in RateMyProfessors: “SO. MUCH. WORK”.

Indeed. The only way to learn a craft is to practice the craft nonstop.

My classes consist of three parts:

First month: lectures, readings, discussions, and exercises to master concepts, reasoning, and software tools. I don’t grade these exercises, I simply give credit for completion, but I hint what grades students would receive if I did grade them.

Second month: Project 1. I give students a general theme and a client. This semester I chose The Economist magazine’s Graphic Detail section, so a requirement for the project was that students tried to mimic its style. Once a week during this second month I give each student individualized advice on their progress prior to the deadline. I don’t give most feedback after they turn their project in, but before.

Third month: Project 2. I give students complete freedom to choose a topic and a style. I also provide weekly feedback, but it’s briefer and more general than on Project 1.

He then shares some examples of student projects. The results are really impressive! Sure, one reason they look so good is that they’re copying the Economist’s style (see “Second month” above), but that’s fine. To have made something that looks so clean and informative is a real accomplishment and a great takeaway from a semester-long course.

When I teach, I try—not always with success—to make sure that, when the course is over, students can do a few things they could not do before, and that they can fit these new things into their professional life. Cairo seems to have done this very effectively here.

31 thoughts on “Alberto Cairo’s visualization course”

  1. Interesting to use the Economist’s Graphic Detail section: I’ve been struck by the nearly universal poor design of that section. It often takes me a long time to figure out what each display shows, and sometimes I just fail in the attempt. In addition, the design features (colors, lines, labeling, etc.) I often find confusing and distracting.

    So, my question is: are my perceptions shared, or is this used as examples of good displays?

    • Rereading my comment:
      1. Please read my corrected grammar: “are these used as examples of good displays?”
      2. I see that it is Andrew that has intimated that the Economist displays are good: “The results are really impressive! Sure, one reason they look so good is that they’re copying the Economist’s style…” This provokes me to ask Andrew, do you find that section of the Economist to be an example of good displays? Because I don’t.
      3. Question for Alberto Cairo: do you view the Graphic Detail section of the Economist as a good example of data visualization?

      I ask these because I have frequently thought about writing to the Economist about just how poor I find those displays. They are unduly complex and it is often hard for me to figure out what the real message is. I also find the visual message does not match what the words are saying. I have been so struck by this each week (perhaps 10% of the displays are good, in my opinion) that I find it shocking that Andrew and Alberto might find it otherwise.

      • If you want an example, try last week’s Economist: https://www.economist.com/graphic-detail/2023/01/24/where-have-all-americas-workers-gone. The content of that graph is interesting – how demand and supply of labor compare, and the second panel showing the relationship with the change in wages (not clear if it is adjusted for inflation, though it should be). After a bit, I can figure out what the top graph shows – but surely there is a more direct display (and better labeling) to show that. How it is related to the second graph is far from clear to me. It makes me think of the final section of Tufte’s Visual Display of Quantitative Information:

        “What is to be sought in designs for the display of information is the clear portrayal of complexity. Not the complication of the simple; rather the task of the designer is to give visual access to the subtle and the difficult – that is, the revelation of the complex.”

        • A regular reader of The Economist would understand this in a second or two. It’s their standard layout. I prefer a traditional legend identifying the lines and shading, labeled axes, etc. But I don’t have a problem understanding this chart. In the print mag many charts are barely two inches square, so the info has to be communicated with the fewest possible number of graphic elements. In that sense their charts are a marvel.

          I presume the wages are nominal since it’s not noted otherwise. Also the wage data, maxing at ~6% annual increase, suggest the wages are nominal, approximately keeping pace with inflation.

          Overall though there is a higher point here: *readers can be trained to understand abbreviated presentations of information*. That’s critical: much of the “scientific” analysis of visualizations is based on the incorrect assumption that humans have a fixed way of understanding visual information. They don’t. They learn. As long as the “grammar of graphics” is *consistent* they can learn how to understand different types of presentations.

        • chipmunk
          As usual, we live on different planets. I am a regular reader of the Economist – have been for around 40 years. It did not take a second or two to understand the graph. It took longer, though the top graph wasn’t difficult. Still there are much more direct ways to show what the top graph shows, and the color shading distracts from the fundamental comparison of work force with openings. Just using two colored lines, clearly labeled, would do a better job. Then there is the wage change graph tied to the top picture – I still don’t know what to make of that.

          The only thing I find marvelous about that section of the Economist is that the graphs do look nice and draw attention. As Andrew says, that is important. But not as important as revealing the complex and not complexifying the simple – both of which that section is generally guilty of (in my opinion).

          I have also spent 30 years training people “to understand abbreviated presentations of information.” My primary use of these graphs in the Economist is to train people how to improve them.

        • Or try another:
          https://www.economist.com/graphic-detail/2023/01/03/americas-117th-congress-accomplished-a-lot-so-did-its-recent-predecessors
          I would say in general: that the content is always of interest in the Graphic Detail section; that the graphs almost always try to do too much – too many variables displayed at one time; that the relationships are rarely clear and don’t easily match the verbal descriptions; that the labeling is generally poor; and that the graphs are eye-catching and look nice (if you don’t worry about the content).

          Most of the time, when I finally figure out what these graphs show, there are more direct and clearer ways to display the message(s).

        • In “Where have all America’s workers gone?”, the colored fill between the lines on the first plot does not bother me. I’m not sure whether I think it improves the graphic or makes it worse, but whichever direction the effect goes is small. But that’s me. Different people are different.

          My main complaint about the upper plot is the label “filled and open jobs.” A job that is filled is not open, and vice versa, so these jobs are not “filled and open.” It’s filled jobs plus open jobs, or filled jobs and open jobs. Of course I realized in under ten seconds that that’s what they meant, but that was a bit of cognitive load that could have been avoided: there’s plenty of space on the graphic, so they could have just said “filled jobs plus open jobs”, or perhaps “jobs plus unfilled positions” or something.

          Also, I wonder whether they have the data on filled jobs separately from open jobs, and, if so, whether that would be worth displaying. Are there currently way more ‘open’ jobs than has typically been the case? The question seems relevant to the point they’re discussing, so if they have the data I’d like to see them. But I wouldn’t want a whole bunch of lines (jobs, available positions, total of the two, and size of workforce). Perhaps a stacked area plot, with jobs plus available positions stacking up to give the total of the two, and then just a line showing the size of the workforce. So I think they might have had the data to easily improve that plot, but I don’t think the existing one is bad.

          As for how the lower plot connects to it: whenever I hear that someone is having trouble filling a given position, my first thought is that they aren’t offering a high enough salary. To me, and perhaps other people, it’s natural to wonder whether jobs are generally paying less, more, or the same as they were pre-pandemic. So it makes sense to have some sort of wage plot to go with the jobs plot. But I’m with Dale inasmuch as I don’t know what to make of the specific plot they’ve chosen. Perhaps the biggest problem is that the lower plot shows the change in mean wage from a year ago, not the actual wage. Am I supposed to try to integrate it in my mind? Also, as the plot emphasizes by pointing out “one year since start of pandemic”, it’s kinda weird to compare to “a year ago” rather than, say, “a month ago”: when you see that huge dip it’s easy to think that wages suddenly dropped, but no, that’s just what happens when wages stay the same but had experienced a giant spike exactly a year earlier. So I really don’t like that lower plot, either on its own or in the context of the upper plot.

          Moving on to the Congress graphic, though: I like it. Some systems are complicated and can’t reasonably be boiled down to a single graphic with a small number of elements, and this is a good illustration. If you’re discussing the role of partisanship in congressional function (as measured by major bills passed) then…well, what can you do? At least for starters it makes sense to ask: how many major bills were passed, and how many were passed with substantial bipartisan support? The upper part of the figure displays that, and does a nice job — in my opinion — of pointing out some specific bills that we’ve all heard of. But you, or at least I, also want to know whether Congress was split or was controlled by a single party, and whether the President was a Democrat or Republican. I find this graphic very effective at conveying that information. (And by the way, how is it that I did not previously realize that there has not been a time in modern history when both branches of Congress had majorities of the opposite party from the President? If asked, I would have guessed that that happened twice, during the Reagan administration and the Clinton administration. I thought the Contract On America happened when Republicans controlled Congress. I guess they only had the House.) Bottom line is that I quite like this graphic. That’s not to say it couldn’t be improved. Or, rather: perhaps I would have preferred to be spoon-fed some insights that I can only attain by basically doing my own analysis of the data in the graphic. I might prefer to see different numbers. But given that these are the numbers they wanted to display, I think they did a good job of displaying them.

        • Phil: “how is it that I did not previously realize that there has not been a time in modern history when both branches of Congress had majorities of the opposite party from the President?”

          Phil– I think you’ve misread the graph. There have been many times when this was true recently. For example, when Nixon was President both houses had Democratic majorities. (This fact is *not* in the graph.) That you couldn’t easily see that reveals a weakness in the graph.

        • Gregory,
          Ah, I see: I saw the bottom row was Presidency and assumed the row above it was for control of Congress. But no, it’s control of Congress _and_ the Presidency. WTF. So, OK, I don’t like the display. I like it the way I thought it was made, but not the way it’s actually made.

        • Just as another data point, I agree with Phil on all counts.

          I’m still confused by the major laws passed graph though. What do the circles at the bottom mean? At first I interpreted it like Phil did, as representing control of the House / Senate. But if they represent the House, Senate, and presidency, I don’t understand what the half red / half grey circle means. Blue/red means Dems/Reps control all three, and grey means Dems and Reps each control at least one. So what could the half and half circle mean?

        • To all:

          The circles at the bottom show:

          Red: Repubs control all branches;
          Gray: divided government;
          Blue: Dems control all branches.

          The legend to this part of the chart is right above it in colored text.

        • @chipmunk

          1. “All branches” is wrong, the chart does not take in the split of the supreme court, only congress and the presidency.

          2. That does not explain the half red half gray circle. If gray is already divided, why divide the circle further? It could be that half gray means the presidency is one party and both houses of congress are the other, or that the president and one house of congress match and the other is the odd one out. Or it could be a half-term split (hard to tell because of the odd year axis). You can disambiguate by looking things up and counting circles and stuff. But can you do it in “a second or two?”

        • Somebody:

          1) No party “controls” the Supreme Court. It’s irrelevant.

          2) one of the seats changed in the middle of the session.

        • Chipmunk:

          You write, “No party ‘controls’ the Supreme Court.” That’s true in a literal sense, and indeed no party controls the executive branch or either house of Congress. The president is just a person, someone who runs on a party ticket but is not controlled by his party. Similarly, Congress is run by its members who are not controlled by any party mechanism.

          At the same time, currently in the U.S. government, the executive branch and the Senate are controlled by Democrats, and the Supreme Court and the House of Representatives are controlled by Republicans.

        • @chipmunk

          1.) It is commonly understood that there are Conservative/Liberal justices, corresponding neatly with their appointing president.
          2.) I’m not saying the graphic should take into account the split of the supreme Court, just responding to what you said about all branches
          3.) That’s incorrect. The graphic operates on 2-year increments. What actually happened is that the 107th Congress had a rare even split in the Senate. So the presidency and house of reps were republican and nobody had a senate majority. But it took me more than 1 or 2 seconds to figure that out

        • Last thoughts:

          “You can disambiguate by looking things up and counting circles and stuff. But can you do it in “a second or two?””

          Yes, I did. I know no president has left office mid-term since Nixon. The only other solution is a seat change in Congress. Members of Congress leave office in mid-term relatively often. I checked the Wikipedia plot to confirm, but it’s not clear from that plot.

          Knowledge is more useful than data.

        • Actually, now I’m not sure. There are other de facto ties where the Economist fills the whole circle in with the Presidency due to the tiebreaking vote. There are also other cases (2021-23) where control changes mid-session and the Economist doesn’t split the circle. So I don’t know. Maybe it’s because 2021-23 changed very soon into the session? Regardless, the point I’m making is that it’s not immediately clear what it means.

        • @ all, I can’t resist one more:

          Economist charts:
          My experience with their charts is that the necessary information is always there. Once you operate on that assumption, it’s usually easy to find the necessary info. Just assume it’s there. Just like with the wage growth chart: no reference to “real” wages? then it’s nominal wages. What’s not there counts.

          Supreme court:
          It’s irrelevant to the chart. The chart is about legislative production in congressional sessions. The supreme court has nothing to do with that. It’s a great example of what would be unnecessary information.

          The court currently has a conservative majority. That’s not a Republican majority. The conservative members of the court may all be Republican. But they do not conduct themselves professionally on that basis. It’s a non-partisan institution.

        • Oh goodness

          @Somebody:

          Look closer at your congressional link re: 107th Congress (2001–2003):

          Majority Party (Jan 20–June 6, 2001): Republicans (50 seats)
          Minority Party: Democrats (50 seats)

          Majority Party (June 6, 2001–November 12, 2002): Democrats (50 seats)
          Minority Party: Republicans (49 seats)
          Other Parties: 1 Independent (caucused with the Democrats)

          So, yes, the majority did change. See? The Economist is always right!

          “The power shift was triggered by Vermont Sen. James Jeffords’ decision to leave the Republican Party, flipping the 50-50 Senate balance”

          OK, that’s my Andrew Gelman Blog time for a while.

        • @chipmunk

          Yes, I know it did in fact split mid-session. The issue with interpreting the half red circle as a mid-session split is that there was also a mid-session split in 2021-23 which is filled in solid. You can argue that the 2021-23 split was much shorter, but then there’s some undocumented cutoff for when a split becomes meaningful.

          My point here is that the mapping from the data to the chart is ambiguous. The issue here is that you’re taking “understanding the graphic” to mean “understanding how the graphic relates to the author’s main point” which, yeah, you can do in one or two seconds. Others are taking it to mean “understanding how the graphic relates to the data.”

  2. I think there may be a conflict between “looks pretty” and “communicates well”. I think the Economist displays are quite eye-catching. I too have some difficulty learning from them. I wonder if the course includes some sort of pre-post knowledge assessment of the students’ work. I know – that is NOT an easy thing to do properly either.

    I guess the old saying “There’s no accounting for taste” applies here.

  3. Alberto’s materials are well worth reading. One possible reason he chooses the Economist style is that it is highly distinctive and stable. I can definitely pick out an Economist chart from a blind test. I agree, though, that there are things I dislike about their style.

  4. Very impressive work by the students, and a good advertisement for the course. For those interested, the Economist has a weekly newsletter (Off The Charts) where they sometimes discuss their philosophy and how their charts come together. I agree sometimes there’s too much going on and they miss the mark, but they’re usually pretty good. In terms of how long it takes to understand what’s going on in a chart, do other commenters find the Flow Map of Napoleon’s Invasion of Russia that easy to understand? I only ask because that is typically presented as a masterpiece in data visualization.

    • Blackthorne:

      The Napoleon graph is good in itself, but overall I think it’s had a bad influence on graphics by giving people the idea that there should be a single plot that shows it all. I usually prefer multiple small graphs on a page. Trying to show it all in one plot is a trap to be avoided!

  5. I feel like people are criticizing The Economist graphics without understanding how they’re used to support the thesis of the article. It doesn’t seem appropriate to suggest revisions to a graph that’s part of an article unless we know what the article says. With that in mind:

    The design of graphics varies depending on their use. Perhaps we can break down the use of charts and graphs into three categories, with a continuum between:

    1) Independent Communication (interpretative). Designed as standalone graphics to demonstrate a thesis.

    2) Integrated Communication: (interpretative) Designed to be integrated with text discussion to demonstrate a thesis.

    3) Data reporting: (non-interpretive) designed to show all or a summary of all of the data acquired for a project, with no interpretation or thesis.

    Alberto Cairo’s page contains visualizations that are between 1 and 2, but closer to the independent end. Charts in The Economist are very “#2”, created to support the thesis of the article. Research papers typically run the gamut between 2 & 3, starting with one or two graphics to show all or a summary of the data, and continuing in the discussion with graphics that are designed to support argument in the text. Finally, there are some papers or publications that just present data, with only bare minimum interpretation.

    People here often want to see things that aren’t on the chart, but if someone’s writing a piece to support a particular thesis, it’s not critical to show everything that everyone might want. The job is to show what’s needed to demonstrate the thesis.

    • It is precisely your point that bothers me about the Economist visualizations – they often don’t really support the text, or at least the connection is quite murky. The text makes clear points and the complexity of the visualizations is supposed to convince you that the evidence shows what they say, but that connection often fails in my mind. You can say it is my failure to understand what you can see in 1-2 seconds, but I’ll stand by my take.

      • “It is precisely your point that bothers me about the Economist visualizations – they often don’t really support the text, or at least the connection is quite murky”

        It’s been a long time since I read the Economist on a weekly basis and I don’t have access to these articles, so that may be so, and if so, it shouldn’t be so! :) I don’t recall that, but the 00s are a foggy bog in my memory.

        By pointing out that regular readers of the economist would readily recognize the format and features of the graph, it wasn’t my intention to belittle you at all. You’re a very capable person. Instead, it was to point out that people adjust to conditions – a frequently overlooked point in discussions about the supposed “science” of visualization and education in general, both of which often treat people as mechanical devices with permanent mechanical levers and gears that determine how everything is perceived, with no or very little adaptability. But the adaptability is the critical part, because that’s what lets us understand increasingly complex information with the same or less effort.

  6. If anyone wants to continue this topic, take a look at this week’s Economist data visualization: https://www.economist.com/graphic-detail/2023/01/31/habitat-loss-and-climate-change-increase-the-risk-of-new-diseases. The first two graphs are good – at least they are clear to me and convey the message. The third graph is a poor stacked bar chart and unnecessary. I can figure out that the growth in foraging areas is mostly in the urban areas, although it isn’t a clear trend, nor are any of the component trends clear. Why not use a line chart, with a line for the total and lines for each component? The final graph I actually can’t figure out what it is showing – what are the sized circles, what is being measured, and does the graph mean that there were no transmissions of the Hendra virus during food shortages prior to 2012?
