When plotting all the data can help avoid overinterpretation

Patrick Ruffini and David Weakliem both looked into this plot that’s been making the rounds, which seems to suggest a sudden drop in some traditional values:

Percent who say these values are 'very important' to them

But the survey format changed between 2019 and 2023, both moving online and randomizing the order of response options.

One clue that you shouldn’t draw sweeping conclusions specific to these values is that the reported importance of “self-fulfillment” and “tolerance” drops too. Weakliem writes that once you collapse a couple of response options…

there’s little change–they are almost universally regarded as important at all three times. The results for “self-fulfillment,” which isn’t mentioned in the WSJ article, are particularly interesting–the percent rating it as very important fell from 64% in 2019 to 53% in 2023. That’s hard to square with either the growing selfishness or the social desirability interpretations, but is consistent with my hypothesis. These figures indicate some changes in the last few years, but not the general collapse of values that is being claimed.

If the importance of everything drops at once, that is a clue that selectively interpreting a few thematically related drops is likely not justified, whether the cause is the survey format change or something else (say, something that was not asked about becoming comparatively more important).

So perhaps this is a good reminder of the benefits of plotting more of the data, even if you want to argue that the action is all in a few of the items. (You could even think of this as something like a non-equivalent comparison group or difference-in-differences design.)
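The comparison-group idea can be made concrete with arithmetic on figures quoted on this page. This is a minimal sketch, not an analysis from the post: it uses the "very important" percentages for patriotism (61% in 2019, 38% in 2023, as cited in the comments) and for self-fulfillment (64% to 53%, as quoted from Weakliem). Treating self-fulfillment as the comparison item is an illustrative assumption, and the function names are hypothetical.

```python
# Percent answering "very important", as quoted on this page.
very_important = {
    "patriotism": {2019: 61, 2023: 38},        # values cited in the comments
    "self-fulfillment": {2019: 64, 2023: 53},  # values quoted from Weakliem
}

def change(item, before=2019, after=2023):
    """Raw change in percentage points for one item."""
    return very_important[item][after] - very_important[item][before]

def diff_in_diff(focal, comparison):
    """Change in the focal item net of the change in the comparison item."""
    return change(focal) - change(comparison)

raw = change("patriotism")                 # change in the headline item
baseline = change("self-fulfillment")      # change in the comparison item
net = diff_in_diff("patriotism", "self-fulfillment")
print(raw, baseline, net)                  # prints: -23 -11 -12
```

On these numbers, roughly half of the 23-point drop in patriotism is matched by the drop in an item that nobody cites as a collapsing traditional value. That does not identify the cause, but it shows how much of the headline change a shared shift (such as a format change) could plausibly account for.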

Update: Here is a plot I made from the numbers in the Weakliem post. In making this plot, I came up with one guess as to why the original plot has this weird x-axis: with a properly scaled x-axis of years, the tick labels can easily run into each other. (Note that I copied the original use of “’23” as a shortening of 2023.)
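To see why a linear year axis is awkward here, it helps to work out where the tick labels would land. This is a rough sketch under stated assumptions: the three survey waves are taken to be 1998, 2019, and 2023 (the post names only the last two, so 1998 is an assumption), and the panel width and label width are hypothetical values chosen for illustration, not measured from either plot.

```python
years = [1998, 2019, 2023]   # 1998 is an assumption; the post names only 2019 and 2023
panel_width_in = 1.5         # hypothetical width of one small-multiple panel, in inches
label_width_in = 0.30        # hypothetical width of a four-digit label such as "2019"

span = years[-1] - years[0]
# Horizontal position of each tick on a linearly scaled axis.
positions = [(y - years[0]) / span * panel_width_in for y in years]

def labels_overlap(pos_a, pos_b, width=label_width_in):
    """Centered labels collide when their centers are closer than one label width."""
    return abs(pos_b - pos_a) < width

# Distance between neighboring ticks, and whether their labels would collide.
gaps = [b - a for a, b in zip(positions, positions[1:])]
collisions = [labels_overlap(a, b) for a, b in zip(positions, positions[1:])]
print(gaps, collisions)
```

Under these assumptions the last two ticks sit about a quarter of an inch apart, which is less than the label width, so their labels collide. Shortening a label (as with “’23”) or spacing the waves evenly are two ways to sidestep that.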

Small multiples of WSJ/NORC survey data

[This post is by Dean Eckles.]

8 thoughts on “When plotting all the data can help avoid overinterpretation”

    • Indeed. I think one way to think about these plots is as something like a more visual version of a table (given that all the numbers are printed as well).

      I can also imagine the person making the original plots might have encountered problems fitting the axis tick labels in when using a linear scale. Certainly that happened when I produced my version.

        • There still seems to be an issue with the “y-axis values are higher than in the original” point that Anon raised. For example, in the original, Patriotism went from 70% to 61% to 38%, but in the revised plot the values are (approximately) 92%, then 89%, then 72%.

        • Yes, I didn’t make this super clear, but that’s intentional in that the version I’m plotting pools the top two response options, something Weakliem explores because of concerns about survey format and randomized response order.

  1. Are those arrowheads some sort of new symbol that I’m not familiar with? Maybe it means “nose-diving” or “skyrocketing” when one uses arrowheads if the value is out of the limits of the y-axis? Cool! I’m gonna start using those on my plots but improving them by scaling the size of the arrowhead by the amount of nose-diving and skyrocketing involved.
