Interesting y-axis

Merlin sent along this one:

P.S. To be fair, when it comes to innumeracy, whoever designed the above graph has nothing on these people.

As Clarissa Jan-Lim put it:

Math is hard and everyone needs to relax! (Also, Mr. Bloomberg, sir, I think we will all still take $1.53 if you’re offering).

41 thoughts on “Interesting y-axis”

    • I do not know but it looks like it would be hard to do.

      Graph with linear Y-axis

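      library(ggplot2)
      # Daily counts read off the chart, in chronological order
      score <- c(33,61,86,112,116,129,192,174,344,304,327,246,320,339,376)
      ss <- seq_along(score)  # day index, 1 through 15
      dat2 <- data.frame(ss, score)
      # ggplot2 uses a linear y-axis by default
      ggplot(dat2, aes(ss, score)) + geom_point(colour = "red") + geom_line()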

    • My guess at the sequence of events: the graph was first created with an unlabelled y-axis, with only horizontal gridlines. At some point some honcho says, “Oh wait, you know it’d look deceptive if we don’t have axis labels, especially since the y-axis doesn’t begin at zero.” So some intern has to go back and put in numbers where the gridlines actually are, but the gridlines were just a uniform pattern of lines chosen without any particular meaning, so the spacing between them is actually something like 33. Hence, result.

      • Zhou:

        It can’t just be that. Look at the numbers 33, 61, 86, 112. On the graph, the slope from 86 to 112 is clearly steeper than the slope from 33 to 61. From the positions of the circles on the graph, you can also see that going from 86 to 112 is more than one full bar on the graph, whereas going from 33 to 61 is slightly less than one full bar. But 61 – 33 is 28, and 112 – 86 is only 26. So how can this possibly work out?

        • That’s very subtle, good on you for catching that. That sort of mistake points to the graph being assembled manually in some kind of graphics/drawing package.
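Zhou's arithmetic can be checked in R with the same `score` vector used in the linear-axis snippet earlier in the thread. This is only a sketch; it assumes the points are evenly spaced horizontally (one per day), so that on a linear y-axis the slope of each segment would be proportional to the day-to-day difference. The variable names are made up for illustration.

```r
# Daily counts read off the chart (same vector as in the ggplot2 snippet)
score <- c(33,61,86,112,116,129,192,174,344,304,327,246,320,339,376)

# Day-to-day changes; with evenly spaced days on a linear axis,
# each segment's slope is proportional to its entry here
steps <- diff(score)

first_rise <- steps[1]  # 61 - 33
third_rise <- steps[3]  # 112 - 86
print(c(first_rise, third_rise))  # 28 26

# TRUE: on a linear axis the 33-to-61 segment should be the steeper one,
# which is the opposite of what the chart shows
print(first_rise > third_rise)
```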

  1. Fox31 is a local TV station affiliated with the Fox Network, but the station itself is not owned by the Fox Corporation. So this screen grab of the graph isn’t from the Fox News Channel, the notorious cable channel owned by the Fox Corporation. The distinction is confusing, I know, but you shouldn’t use this graph to bash the Fox News Channel.

      • To belabor my point, it’s the equivalent of the connection between your independently-owned local ABC affiliate and the Disney Channel. Anyway, it’s a crappy chart no matter where it originally appeared.

        • I imagine a conversation took place between a producer who wanted a “data looking thing” and a graphic designer who they tasked with the job.

          Producer: Why isn’t there 100?
          GD: Well it’s not a multiple of 30.
          Producer: But people like seeing round numbers. Can you get a 100 in there somehow?
          GD: The graph doesn’t look as good when I scale it by 10.
          Producer: Just throw 100 into the current graph and make the points sort of match.

          I have been witness to conversations like this. This happens because people prioritize aesthetics over truthful representation of the data.

        • Mike’s scenario is more or less what I was thinking. Everything suggests manual construction by a non-data person.

          Even ignoring the labels on the grid lines, the vertical placement of the data points looks off. 192-174=18 and 344-304=40, but the vertical gaps between those pairs of successive points are about the same. It’s possible the trend was constructed and then it seemed easier to change the grid line labels than it was to move the dots up and down and fiddle with the connecting lines between them.
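The two gaps mentioned above can be compared directly. A minimal sketch, again assuming a linear y-axis and reusing the `score` vector from the ggplot2 snippet; the variable names are hypothetical.

```r
# Daily counts read off the chart (same vector as in the ggplot2 snippet)
score <- c(33,61,86,112,116,129,192,174,344,304,327,246,320,339,376)

# Absolute day-to-day changes
gaps <- abs(diff(score))

drop_192_174 <- gaps[7]  # |174 - 192|
drop_344_304 <- gaps[9]  # |304 - 344|
print(c(drop_192_174, drop_344_304))  # 18 40

# On a linear axis the second gap should be drawn more than twice
# as tall as the first, not about the same height
print(drop_344_304 / drop_192_174)
```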

  2. The intervals are: 30, 30, 30, 10, 30, 30, 30, 50, 10, 50, 50, 50

    I think this was made point-by-point in something like ms paint without looking ahead.

    – First the numbers increased by about 30, so they used 30.

    – Then the next (112) was kind of close to 100, which is a *great* double round number to have on your chart, so they put that one in, probably figuring the numbers represented something like percentages, so that was the maximum (sometimes you hear of percentages over 100, after all).

    – Then they returned to the 30 pattern for a bit, using the 100 as a baseline (which is similar to zero). But unfortunately (for their method) they got a number way higher than 200 (344), so when they had to add more lines they figured they should increase the interval to 50. I.e., at this point the y-axis went from 190 to 350.

    – The next number didn’t increase a lot like they expected, it decreased by about 40 to 304, but that still fit with the 50 interval so great.

    – Then 327 was halfway between the last two so ok, no new line needed.

    – But then a huge drop to 246. At this point the y-axis went straight from 190 to 300. Should they add 50 to the 190 or subtract from the 300? Fuck it add them both.

    – Then the last 3 basically fit with the 50 interval near the top.
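The list of intervals at the top of this comment can be turned back into gridline labels with a cumulative sum. This is a sketch that assumes the first interval is measured up from zero, which the comment does not state explicitly; `gridline_labels` is a made-up name.

```r
# Intervals between successive gridlines, as listed in the comment
intervals <- c(30, 30, 30, 10, 30, 30, 30, 50, 10, 50, 50, 50)

# Assumption: the first interval counts up from 0
gridline_labels <- cumsum(intervals)
print(gridline_labels)

# The spacings that appear; a linear axis would have exactly one
print(sort(unique(intervals)))

# "Add them both": 190 + 50 and 300 - 50 both show up as labels
print(c(190 + 50, 300 - 50) %in% gridline_labels)
```

Under that assumption, the labels 190, 300 and 350 mentioned in the steps above all fall out of the same sequence.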

  3. Well if we’re trashing on data viz, all the COVID-19 maps get a D- for:

    1) switching the administrative entity level from day to day
    2) mixing administrative entity levels on the same map (comparing cities, counties and countries on the same level)
    3) using poor relative sizing, so that day after day the circles for WA and NY are indistinguishable, even as NY went from 500 to 5000 fatalities.

    IMO automated data viz and data viz research as supreme authority hasn’t yielded much over what common sense would tell us and frequently delivers bad results.

  4. Another problem is the sum is 3159, not 3342. The gap of 183 can’t be explained. The daily rate is below 100 on the left end and above 300 on the right end of this time window.
    Still mystified by the purpose of the manipulation. The person who created it clearly spent time messing around with it, but why?

      • Is that sarcastic? The link said the total number of cases was 183 on that day, not new cases. In any case, it’s not as easy to explain how the numbers on the chart sum up to 3159 but the total printed on the top right is 3342. The designer would have to be looking at two inconsistent data sources, and why would anyone do that for a simple line chart?

        • I don’t know where you see the difficulty, honestly. The total number of cases was 183 as of March 17 according to that article. If you now add the new daily cases given in the chart (33 on March 18, 61 on March 19, …, 376 on April 1) you can calculate the total number of cases as of April 1… and it happens to be 3342.

        • Thanks for that. I see there is yet another potential for confusion: “total cases” versus “total new cases”. Usually, the total is the sum of the values shown on the chart. Here they do mean “total cases” which is exactly as you describe.
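The reconciliation described in this sub-thread can be reproduced in a few lines of R. A sketch only: the 183 starting total is taken from the comment above rather than from any data source, and the variable names are made up.

```r
# New daily cases read off the chart, March 18 through April 1
score <- c(33,61,86,112,116,129,192,174,344,304,327,246,320,339,376)

# Total cases as of March 17, as quoted in the comment above
cases_before <- 183

new_cases <- sum(score)
total_cases <- cases_before + new_cases
print(c(new_cases, total_cases))  # 3159 3342

# Running total of cases by day, ending at the figure printed on the chart
print(cases_before + cumsum(score))
```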
