## Interesting y-axis

Merlin sent along this one:

P.S. To be fair, when it comes to innumeracy, whoever designed the above graph has nothing on these people.

As Clarissa Jan-Lim put it:

Math is hard and everyone needs to relax! (Also, Mr. Bloomberg, sir, I think we will all still take \$1.53 if you’re offering).

My first instinct was to see if there was some sort of pattern there, like those pattern puzzles you tried to solve as a kid, and then I realized it was Fox.

2. Sam says:

One would have to try to make this chart look the way it does, right? It can’t just be incompetence.

• Luke says:

I’m still erring on the side of Hanlon’s razor. Maybe Matt Parker can find a good cause behind this perplexing mistake.

• jrkrideau says:

I do not know but it looks like it would be hard to do.

Graph with linear Y-axis

library(ggplot2)
score <- c(33,61,86,112,116,129,192,174,344,304,327,246,320,339,376)
ss <- seq_along(score)
dat2 <- data.frame(ss, score)
ggplot(dat2, aes(ss, score)) + geom_point(colour = "red") + geom_line()

• Zhou Fang says:

My guess at the sequence of events: the graph was first created with an unlabelled y-axis, with only horizontal gridlines. At some point some honcho says, “Oh wait, you know it’d look deceptive if we don’t have axis labels, especially since the y-axis doesn’t begin at zero.” So some intern has to go back and put in numbers where the gridlines actually are, but the gridlines were just a uniform pattern of lines chosen without any particular meaning, so the spacing between them is actually something like 33. Hence, result.

• Andrew says:

Zhou:

It can’t just be that. Look at the numbers 33, 61, 86, 112. On the graph, the slope from 86 to 112 is clearly steeper than the slope from 33 to 61. From the positions of the circles on the graph, you can also see that going from 86 to 112 is more than one full bar on the graph, whereas going from 33 to 61 is slightly less than one full bar. But 61 – 33 is 28, and 112 – 86 is only 26. So how can this possibly work out?
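The arithmetic here is easy to check. A minimal sketch in base R, using only the four values quoted in this comment (the variable names are illustrative, not from the original chart):

```r
# First four daily values quoted in the comment above
first_four <- c(33, 61, 86, 112)

# Day-to-day changes: 61 - 33, 86 - 61, 112 - 86
steps <- diff(first_four)
print(steps)

# On a linear axis the 33 -> 61 step should rise more than the 86 -> 112 step
print(steps[1] > steps[3])
```

On a linear y-axis the first step would therefore be drawn slightly taller than the third, which is the opposite of what the chart shows.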

• Zhou Fang says:

That’s very subtle, good on you for catching that. That sort of mistake points to the graph being assembled manually in some kind of graphics/drawing package.

3. Robin Morris says:

ObXKCD https://xkcd.com/2289/

4. Janne Sinkkonen says:

It could be non-monotonic.

5. Seb says:

The lolgarithmic scale.

6. Charlie says:

Fox31 is a local TV station affiliated with the Fox Network, but the station itself is not owned by the Fox Corporation. So this screen grab of the graph isn’t from the Fox News Channel, the notorious cable channel owned by the Fox Corporation. The distinction is confusing, I know, but you shouldn’t use this graph to bash the Fox News Channel.

• Rahul says:

Close enough..

• Charlie says:

To belabor my point, it’s the equivalent of the connection between your independently-owned local ABC affiliate and the Disney Channel. Anyway, it’s a crappy chart no matter where it originally appeared.

• Mike H says:

I imagine a conversation took place between a producer who wanted a “data looking thing” and a graphic designer who they tasked with the job.

Producer: Why isn’t there 100?
GD: Well it’s not a multiple of 30.
Producer: But people like seeing round numbers. Can you get a 100 in there somehow?
GD: The graph doesn’t look as good when I scale it by 10.
Producer: Just throw 100 into the current graph and make the points sort of match.

I have been witness to conversations like this. This happens because people prioritize aesthetics over truthful representation of the data.

• Andrew says:

Mike:

Yes, see for example Figure 11 of this paper. That particular visualization won an award!

• Martha (Smith) says:

Andrew: FYI — The link gives me the message “Not Secure”.

• Andrew says:

Martha:

Yeah, I think it does that for websites that are http rather than https.

• Jeff says:

Mike’s scenario is more or less what I was thinking. Everything suggests manual construction by a non-data person.

Even ignoring the labels on the grid lines, the vertical placement of the data points looks off. 192-174=18 and 344-304=40, but the vertical gaps between those pairs of successive points are about the same. It’s possible the trend was constructed and then it seemed easier to change the grid line labels than it was to move the dots up and down and fiddle with the connecting lines between them.

• Dzhaughn says:

To emphasize this point, it was someone at the Seattle Fox affiliate that put the “deep fake” of Donald Trump licking his chops on the air.

7. Chris Mecklin says:

I have to bookmark this one for my intro level Statistical Reasoning course this fall when we cover misleading graphs.

8. Nathan Nguyen says:

The last value on the X-axis also looks interesting

9. Anoneuoid says:

The intervals are: 30, 30, 30, 10, 30, 30, 30, 50, 10, 50, 50, 50

I think this was made point-by-point in something like ms paint without looking ahead.

– First the numbers increased by about 30, so they used 30.

– Then the next (112) was kind of close to 100 which is a *great* double round number to have on your chart so they put that one in probably figuring the numbers represented something like percentages so that was the maximum (sometimes you hear of percentages over 100 after all).

– Then they returned to the 30 pattern for a bit using the 100 as a baseline (which is similar to zero). But unfortunately (for their method) they got a number way higher than 200 (344), so when they had to add more lines they figured they should increase the interval to 50. I.e., at this point the y-axis went from 190 to 350.

– The next number didn’t increase a lot like they expected, it decreased by about 40 to 304, but that still fit with the 50 interval so great.

– Then 327 was halfway between the last two so ok, no new line needed.

– But then a huge drop to 246. At this point the y-axis went straight from 190 to 300. Should they add 50 to the 190 or subtract from the 300? Fuck it add them both.

– Then the last 3 basically fit with the 50 interval near the top.
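The interval list above can be turned back into axis labels, and from there into rough vertical positions. This is only a sketch under two assumptions that are not confirmed anywhere in the thread: that the first interval is measured from a baseline of 0, and that the gridlines are drawn evenly spaced regardless of their labels (Zhou Fang's guess). The `score` values are the ones from the ggplot2 example earlier in the comments; the other names are illustrative.

```r
# Gaps between successive y-axis labels, as listed in the comment above
intervals <- c(30, 30, 30, 10, 30, 30, 30, 50, 10, 50, 50, 50)

# ASSUMPTION: the first gap is measured from a baseline of 0
labels <- cumsum(intervals)
print(labels)

# Daily values from the ggplot2 example earlier in the thread
score <- c(33, 61, 86, 112, 116, 129, 192, 174, 344, 304, 327, 246, 320, 339, 376)

# ASSUMPTION: gridlines are evenly spaced, one unit apart, whatever their labels.
# approx() interpolates linearly between neighbouring gridlines.
height <- approx(x = c(0, labels), y = 0:length(labels), xout = score)$y
print(round(height, 2))

# Rise, measured in gridline gaps, for the two steps Andrew compares
rise_33_to_61 <- height[2] - height[1]
rise_86_to_112 <- height[4] - height[3]
print(round(c(rise_33_to_61, rise_86_to_112), 2))
```

Under these two assumptions the step from 33 to 61 covers about 0.93 of a gridline gap and the step from 86 to 112 about 1.53, because the second step crosses the narrow 90-to-100 band. Whether the points on the actual chart follow this rule is a separate question.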

• Andrew says:

Anon:

I guess the good news is that these incompetent people are just working for some local TV station, it’s not like they’re running the country or anything . . .

10. Dan Bowman says:

In the faint-glimmer-of-hope department, it does look like the last data point is for April 1….

11. jim says:

Well if we’re trashing on data viz, all the COVID-19 maps get a D- for:

1) switching the administrative entity level from day to day
2) mixing administrative entity levels on the same map (comparing cities, counties and countries on the same level)
3) using poor relative sizing, so that day after day the circles for WA and NY are indistinguishable, even as NY went from 500 to 5000 fatalities.

IMO automated data viz and data viz research as supreme authority hasn’t yielded much over what common sense would tell us and frequently delivers bad results.

12. Kaiser says:

Another problem is the sum is 3158 not 3342. The gap of 184 can’t be explained. The daily rate is below 100 on the left end and above 300 on the right end of this time window.
Still mystified by the purpose of the manipulation. The person who created it clearly spent time messing around with it but why?

• Carlos Ungil says:

The sum is 3159 and the gap of 183 can be easily explained: https://www.denverpost.com/2020/03/17/colorado-coronavirus-death-weld-county/

• Kaiser says:

Is that sarcastic? The link said the total cases was 183 on that day, not new cases. In any case, it’s not as easy to explain how the numbers on the chart sum up to 3159 but the total printed on the top right is 3342. The designer would have to be looking at two inconsistent data sources, and why would anyone do that for a simple line chart?

• Carlos Ungil says:

I honestly don’t know where you see the difficulty. The total number of cases was 183 as of March 17 according to that article. If you now add the new daily cases given in the chart (33 on March 18, 61 on March 19, …, 376 on April 1) you can calculate the total number of cases as of April 1… and it happens to be 3342.
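For anyone who wants to redo the sum, here is a short base R sketch. It assumes the fifteen daily values are the ones listed in the ggplot2 example earlier in the thread; the variable names are illustrative.

```r
# Daily new cases shown on the chart, March 18 through April 1
new_cases <- c(33, 61, 86, 112, 116, 129, 192, 174, 344, 304, 327, 246, 320, 339, 376)

# Cumulative total as of March 17, per the linked article
total_march_17 <- 183

# Sum of the charted values, then the running total as of April 1
chart_sum <- sum(new_cases)
total_april_1 <- total_march_17 + chart_sum
print(c(chart_sum = chart_sum, total_april_1 = total_april_1))
```

The charted values sum to 3159, and adding the 183 earlier cases gives the 3342 printed on the chart.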

• Kaiser says:

Thanks for that. I see there is yet another potential for confusion: “total cases” versus “total new cases”. Usually, the total is the sum of the values shown on the chart. Here they do mean “total cases” which is exactly as you describe.

13. Kaiser says:

Mystery of the universe solved. I even made a gif about it. Look here. You can replicate this chart-making technique.

• Anoneuoid says:

But why would someone do that at those dates?

• Carlos Ungil says:

You may be missing an additional point-adjusting step: in your version the 304 ended up on the wrong side of the 300 (née 320) line.

And the mystery of how to create the subcharts remains: for example, why do you put the 116 point much closer to the 100 line than to the 130 line?

14. Kaiser says:

Great points :)
Here is the blog post. So far, no one has solved the mystery: why go to so much effort for so little distortion?