What went wrong in the labeling of those cool graphs of y(t) vs. y'(t)?

Last week we discussed the cool graphs in geographer Danny Dorling’s recent book, “Slow Down.” Here’s an example:

Dorling is plotting y(t) vs y'(t), tracing over time with a dot for each year, or every few years. I really like this.

But commenter Carlos noticed a problem with the above graph:

Comparing 1970-1980 to 1980-1990 the former period shows lower annual increments but the ten-year increment is twice as high.

That’s not right!

So I contacted Dorling and he told me what happened:

The diagram has been mislabelled in the book – the dot labeled “1994” should actually be labeled “1990” (the labels were redrawn by hand by an illustrator).

I had not spotted that before. Below is the graph as I drew it before it went to the publisher. Thanks for pointing that out.

Spreadsheet also attached in case of use.

It’s interesting to compare Dorling’s graph, which already looks pretty spiffy, with the version at the top of this post drawn by the professional illustrator. Setting aside the mislabeled point, I have mixed feelings. Dorling’s version is cleaner, but I see the visual appeal of some of the illustrator’s innovations. One thing I’d prefer to see, in either of these graphs, is a consistent labeling of years. There are two dots below 1600, then a jump to 1800, then every ten years, then every one or two years?, then every ten years? then every year for a while . . . It’s a mess. Also I can see how the illustrator messed up on the years, because some of them are hard to figure out on the original version, as in the labeling of 1918 and 1990.

Dorling adds:

Spreadsheets are here.

Just click on “Excel” to get the graphs without the pendulums – and of course with the formulae embedded. There are a huge number of excel graphs there as there are many sheets with each sheet (far more than in the original book).

The key thing folk need to know if they try to reproduce these graphs is that you have to measure rate of change (first derivative) not at the actual point of change but from a fraction before and after the point you are interested in.

We put over 70 graphs in the paperback edition of the book so I’m happy with the error rate so far. The illustrator was lovely, but as soon as you edit graphs by hand errors will creep in.

She added quite a lot of fun symbols to some of the later graphs. Such as the national bird of each country on the baby graphs (so they were not all storks!)

If you send me albino to the blog I will tweet it.

I guess that last bit was an autocorrect error!
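Dorling’s tip about measuring the rate of change from a little before to a little after each point can be read as a centered difference. Here is a minimal sketch of that reading, not necessarily the exact formula in his spreadsheets. The function name `central_rates` and the numbers are made up for illustration; they are not Dorling’s data.

```python
def central_rates(years, values):
    """Return (year, value, rate) triples for the interior points.

    The rate at point i is the change from the observation before it
    to the observation after it, divided by the time between those two
    observations. This also works when the years are unevenly spaced.
    """
    if len(years) != len(values):
        raise ValueError("years and values must have the same length")
    triples = []
    for i in range(1, len(years) - 1):
        span = years[i + 1] - years[i - 1]
        rate = (values[i + 1] - values[i - 1]) / span
        triples.append((years[i], values[i], rate))
    return triples


# Hypothetical series with uneven spacing (illustrative only).
years = [1950, 1960, 1970, 1975, 1980]
values = [10.0, 14.0, 20.0, 23.0, 24.0]

for year, value, rate in central_rates(years, values):
    # Each triple is one dot: rate on the x axis, value on the y axis,
    # labeled with its year.
    print(year, value, rate)
```

The first and last observations get no dot in this sketch, because there is nothing on one side of them to difference against. Attaching the year to each triple in code, rather than relabeling by hand, is also one way to keep labels tied to the data.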

In all seriousness, I really like the graphs in Dorling’s book, and I also want to emphasize that graphs can be useful without being perfect. Often it seems that people want to make the one graph that does everything. But that’s usually not how it works. One of the key insights of the world of S, or R, or the tidyverse, is that much can be learned by trying out different visualizations of the same data. Indeed, “the data” does not represent some fixed object, and the decision to perform a statistical visualization or analysis can motivate us to bring other data into the picture.

Dorling had some comments about his use of graphs which have some mathematical sophistication (plots of derivatives):

I really wish more social scientists would use these kinds of graphs. One tricky thing in social science is that so many of us are averse to numbers and graphs that it becomes very hard to say: “Look here is a type of graph most of you have not seen before and it shows something interesting”. One reason to have an illustrator work on the graphs is to make them more “user-friendly” to try to get people to look at the graphs rather than just read the words.

Half of my first degree was in maths and stats, so I am happy with these things – but most folk in geography, sociology and even economics are not actually that happy with all but the most simple graphs. We did some of the pandemic and in hindsight they are quite informative as it has cycled around again and again since then.

They only appear in the second edition – and only show wave 1, but almost every country in the world has now had several waves (maybe 6 waves in Japan) – which is what a disease becoming endemic may produce. The waves for Western Europe spiral down thanks to so many vaccines. Although I have not published these.

Also just great that he has the spreadsheets right there.

9 thoughts on “What went wrong in the labeling of those cool graphs of y(t) vs. y'(t)?”

  1. > We put over 70 graphs in the paperback edition of the book so I’m happy with the error rate so far.

    I only looked attentively at one of those charts, so I’m happy with the error discovery rate so far.

    Seriously, it’s great to see the follow up. And I like this kind of chart. I recently saw somewhere (on twitter, maybe, I can’t find it now) an interesting chart showing daily admissions and total occupancy in London hospitals, comparing the paths of the different waves. But they may be too complicated to interpret for the general public.

    • > I recently saw somewhere

      I couldn’t find it so I made my own. With blackjack and hookers.

      https://imgur.com/a/kw6UHFc

      It was something like the chart on the left. The one on the right is more like Dorling’s chart, showing on the x axis the changes of the variable represented on the y axis. The first chart shows new admissions – which are always positive – ignoring outflows due to death or discharge.

  2. Huge thanks to Dorling for everything represented here: the initial work, sharing the data, and the discussion in the responses. Also noteworthy that the illustrator gets praised for adding interest rather than blame for making the inevitable mistake once per thousand characters or whatever.

    It would be great if there were software that allowed the illustrator to modify the plot as desired but without actually redrawing it or relabeling it. I could imagine the ability to change the font of individual strings or characters; to drag a label and have a line automatically drawn to the point to which it is connected (and of course the line itself would also be editable for width and color and so on). Of course all of this can be laboriously hand-done in any illustrator program but I’m talking about something that would remain connected to the underlying data. Maybe such a thing exists, if so I’d be interested in hearing about it.

  3. Comparing the two graphs — the one in the book and the original — one thing that strikes me is that the narrower aspect ratio of the one in the book is very helpful.

    In “The Elements of Graphing Data” Cleveland recommends choosing an aspect ratio such that the slopes you care most about are not too far from 45 degrees (I’m not sure whether this was his idea or someone else’s); he shows a couple of plots of the number of sunspots as a function of time, one of which has spikes going sharply up and down, the other stretched out so the spikes go up and down at close to 45 degree angles, and only in the latter can you see that the slope of the rise is very different from the slope of the fall. Somewhat the same phenomenon shows up in Dorling’s plot, with all those sharp features between 1900 and 1950.
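The 45-degree recommendation in the comment above can be turned into a rough calculation. This is a minimal sketch of one simple variant, picking the height-to-width ratio at which the median absolute segment slope is drawn at 45 degrees. It is not necessarily the exact procedure in Cleveland’s book, and the function name `bank_to_45` and the example numbers are made up for illustration.

```python
from statistics import median


def bank_to_45(xs, ys):
    """Height/width ratio that draws the median absolute slope at 45 degrees.

    A segment with data slope s is drawn with physical slope
    s * (height / y_range) / (width / x_range). Setting that to 1 for
    the median |s| gives height / width = y_range / (median|s| * x_range).
    """
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    points = list(zip(xs, ys))
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for (x0, y0), (x1, y1) in zip(points, points[1:])
        if x1 != x0 and y1 != y0  # skip vertical and flat segments
    ]
    if not slopes or x_range == 0 or y_range == 0:
        raise ValueError("need at least one sloped segment and nonzero ranges")
    return y_range / (median(slopes) * x_range)


# Hypothetical spiky series: every segment has absolute slope 10.
xs = [0, 1, 2, 3, 4]
ys = [0, 10, 0, 10, 0]
print(bank_to_45(xs, ys))  # 0.25: make the plot four times as wide as it is tall
```

The median is just one convenient summary of "the slopes you care most about". Restricting the calculation to the part of the curve of interest, such as the sharp features between 1900 and 1950, would give a different ratio.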
