One thing that I remember from reading Bill James every year in the mid-80’s was that certain topics came up over and over, issues that would never really be resolved but appeared in all sorts of different situations. (For Bill James, these topics included the so-called Pesky/Stuart comparison of players who had different areas of strength, the eternal question (associated with Whitey Herzog) of the value of foot speed on offense and defense, and the mystery of exactly what it is that good managers do.)
Similarly, on this blog–or, more generally, in my experiences as a statistician–certain unresolvable issues come up now and again. I’m not thinking here of things that I know and enjoy explaining to others (the secret weapon, Mister P, graphs instead of tables, and the like) or even points of persistent confusion that I keep feeling the need to clean up (No, Bayesian model checking does not “use the data twice”; No, Bayesian data analysis is not particularly “subjective”; Yes, statistical graphics can be particularly effective when done in the context of a fitted model; etc.). Rather, I’m thinking about certain tradeoffs that may well be inevitable and inherent in the statistical enterprise.
Which brings me to this week’s example.
This graph has been all over the web in the last few days–people loooove it–but from my perspective, it has some obvious, obvious flaws. As I suggested in a comment to Erik’s blog:
A simple dotplot (with occupations listed in order of their average “conservatism” of contributions, however this is measured) would be much better–at least for conveying the information. Better still might be a scatterplot of avg $ contributed vs. avg ideology.
There are also a couple of huge selection effects going on in the above graph:
1. 2008 was an unusual year in which Democrats received more campaign contributions than Republicans. By focusing on 2008, you’re drawing a misleading picture of the historical pattern of partisanship of contributions.
2. It’s not at all clear to me how the categories map to the colors, or what happened to the people who don’t fall into any of the occupation/industry categories shown.
By bringing all this up, I’m certainly not trying to slam Adam Bonica, an enterprising student who went to the trouble of making this graph for free for all of us! Bonica and others have the opportunity to take this forward from here, and that’s what research is all about.
At a more political-sciency level, there’s a lot in Bonica’s article and a lot more to look at. As Erik noted, following our discussion of Bonica’s mixing of occupations (“professors”) with industries (“oil and gas”): “Further complicating things is that for research questions related to lobbying influence you would want to know about industries but if you want to make more sociological statements about ideology, then profession may well be the more appropriate category.”
I’d also refer interested readers (and researchers) to the work of Thomas Ferguson, who’s done some historical studies of campaign contributions in the U.S. since the 1930s.
What I want to focus on here, though, are questions of graphical presentation. From the standpoint of conveying information, I have no doubt that a dotplot or scatterplot would be far superior. Bonica’s graph looks cool but it is filled with distracting vertical lines, and the occupation/industry categories are extremely hard to read, with no sense of whether these categories are intended to be arbitrary, or exhaustive, or something in between.
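To make the suggestion concrete, here is a minimal sketch of the kind of ordered dotplot I have in mind, drawn as plain text in Python. The group names and scores are placeholders invented for illustration (they are not Bonica’s numbers), and the -1 to +1 ideology scale and the `text_dotplot` helper are my own assumptions, not anything taken from his article.

```python
# Placeholder scores, invented purely for illustration; NOT Bonica's data.
# Assumed scale: -1 (most liberal) to +1 (most conservative).
scores = {
    "Group A": 0.35,
    "Group B": -0.60,
    "Group C": 0.10,
    "Group D": -0.20,
}


def text_dotplot(scores, lo=-1.0, hi=1.0, width=41):
    """Return one text line per category, sorted from lowest to highest score."""
    label_width = max(len(name) for name in scores)
    lines = []
    for name, value in sorted(scores.items(), key=lambda kv: kv[1]):
        # Map the score onto a column index in [0, width - 1].
        col = round((value - lo) / (hi - lo) * (width - 1))
        col = min(max(col, 0), width - 1)
        row = ["."] * width
        row[width // 2] = "|"  # reference line at the midpoint of the scale
        row[col] = "o"         # the dot itself (drawn last, so it wins a tie)
        lines.append(f"{name:<{label_width}}  {''.join(row)}")
    return lines


for line in text_dotplot(scores):
    print(line)
```

The one design choice that matters is the sort: listing categories in order of their average score, rather than alphabetically or in whatever order they arrived, is what lets the reader compare groups at a glance. A real version would also need to say how the score is measured and which categories were left out.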
On the other hand, if Bonica had simply made a dotplot or scatterplot, it wouldn’t have looked so cool to many, and it might well have been lost in the
P.S. As I said at the beginning, this topic–the tension between conveying information and grabbing attention–has come up again and again, most notably here.