## Is Statistics Good for Democracy?

Above is the title of the talk I’ll be giving this Wednesday 5:30pm at the Math and Democracy Seminar downtown. Statistics is what people think math is. So maybe they should be calling it a “math and statistics seminar.”

Anyway, here it is:

This talk will be some mix of News You Can Use and Big Idea. You can decide how much you want of each.

News You Can Use: Latest developments in Bayesian workflow, MRP, and Stan.

Big Idea: Statistical methods can be used to better understand public opinion and political behavior. This should improve democracy by enabling and motivating politicians and government officials to be responsive to voters’ desires. But statistics can also be used to evaluate and develop methods for manipulating the public, and the belief in the efficacy of such manipulations can degrade faith in democracy. Open-source data and methods have the potential to equalize the playing field, giving small groups the ability to compete with well-funded institutions and political parties. But many of these methods are most effective with big data that are owned by major corporations. How to go forward?

1. Anoneuoid says:

This should improve democracy by enabling and motivating politicians and government officials to be responsive to voters’ desires.

Not necessarily, it depends what you do. If you optimize for the “average citizen” (who does not exist), that may not be a good outcome for any actual person in the population.

Also, there is the lowest-common-denominator effect. I’m thinking more of how “data-based decisions” are destroying software like Firefox here, but the same should hold for political decisions.

• “If you optimize for the ‘average citizen’ (who does not exist), that may not be a good outcome for any actual person in the population.”

Well, if a “good” outcome goes up on average across a population, it is impossible for literally *all* the participants to get worse off…

It is possible for almost all of the participants to be worse off, though. For example, if abstract billionaire Jeff Gates increases his wealth by, say, \$10 billion while literally every single other person in the world loses exactly \$1, the average net worth will go up (because there are about 7 billion people, and in billions of dollars +10 – 7 is greater than 0).
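The arithmetic in that example can be checked with a few lines of Python. This is only a sketch of the hypothetical scenario above; the variable names are made up for illustration, and the population figure is the rough 7 billion used in the comment.

```python
# Hypothetical scenario from the comment: one person gains $10 billion,
# everyone else loses exactly $1.
population = 7_000_000_000        # rough world population used in the comment
gain_for_one = 10_000_000_000     # dollars gained by the single winner
loss_per_other = 1                # dollars lost by each other person

# Net change in total wealth, in dollars.
net_change = gain_for_one - (population - 1) * loss_per_other

# The mean change is positive even though almost everyone is worse off.
mean_change = net_change / population
share_worse_off = (population - 1) / population

print(f"net change in total wealth: {net_change:,} dollars")
print(f"mean change per person:     {mean_change:.4f} dollars")
print(f"share of people worse off:  {share_worse_off:.10f}")
```

The mean goes up by a bit over 40 cents per person, while the median person loses a dollar.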

• Anoneuoid says:

Not all, but every subgroup:

Well, if a “good” outcome goes up on average across a population, it is impossible for literally *all* the participants to get worse off…

• Yes, this kind of thing is possible, where when you look across certain groups it would look like “increasing x increases y” but in reality for all groups *causally* increasing x decreases y, it’s just that groups with higher x have higher y on average.

This is a reason for analyzing *causality* rather than correlation. But it’s impossible for a causal intervention that increases each subgroup’s y to result in a net reduction in y.

• Steve says:

But what y is being maximized? There is no non-arbitrary welfare function. What ends up happening is that at any given point someone in power says these are the things that Americans care about. Let’s ask what they care most about? But maximizing that social welfare function may not have any connection with maximizing individuals’ welfare functions.

• I got this conversation confused with the other one on diet and metabolism and mixed the content of my replies…

Anyway, there are two questions here:

1) Simpson’s paradox as a question of prediction vs. causality. You can have an X vs. Y relationship that goes up within each subgroup but goes down across subgroups… But if you increase each person’s X, the net result will be an increase in Y along the lines of the within-group trend, not a decrease. This is because *causing something to change* is different from *what you’d expect if you just knew that a person had a higher something*.

If you estimate a causal relationship, and the estimate is accurate, then there is no Simpson’s paradox. But to do this requires running experiments in which you cause changes.
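A toy illustration of point 1, with made-up numbers: two hypothetical groups in which Y rises with X inside each group, yet the pooled trend falls. The last step assumes, purely for illustration, that the within-group slope is the true causal effect, which is exactly what observational data alone cannot establish.

```python
def ols_slope(xs, ys):
    """Least-squares slope of y regressed on x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical data: group B has higher X but a much lower baseline Y.
groups = {
    "A": ([1, 2, 3], [10, 11, 12]),
    "B": ([6, 7, 8], [2, 3, 4]),
}

# Within each group, Y goes up by 1 for each unit of X.
within_slopes = {name: ols_slope(xs, ys) for name, (xs, ys) in groups.items()}

# Pooling the groups reverses the sign of the fitted slope.
pooled_x = [x for xs, _ in groups.values() for x in xs]
pooled_y = [y for _, ys in groups.values() for y in ys]
pooled_slope = ols_slope(pooled_x, pooled_y)

# Assumption for illustration: the causal effect of X on Y is the
# within-group slope (+1). Intervene by raising everyone's X by one unit.
assumed_causal_effect = 1.0
mean_y_before = sum(pooled_y) / len(pooled_y)
mean_y_after = sum(y + assumed_causal_effect for y in pooled_y) / len(pooled_y)

print("within-group slopes:", within_slopes)
print("pooled slope:", round(pooled_slope, 3))
print("mean Y before and after the intervention:", mean_y_before, mean_y_after)
```

The pooled regression predicts that people with higher X have lower Y, but under the assumed causal model, raising everyone’s X raises average Y.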

2) As for realistic “social welfare functions” being impossible to calculate… Meh, we can do so much better than we in fact do these days that the non-existence of a unique non-arbitrary welfare function is not really an argument that we shouldn’t make decisions with reference to social welfare functions; it’s just evidence that we should work hard to construct our proxies. For example: doing surveys, asking people what they care about, looking at how they actually spend their money, looking at how prices change over time, and so forth… putting together some measures that most people agree are at least correlated with things they care about.

I’m reminded of “The Robust Beauty Of Improper Linear Models in Decision Making” https://psycnet.apa.org/record/1979-30170-001

Basically, if you did a survey like the ACS and added a question that asked people to name 10 measures of their well-being that they cared about, and then pooled all ~10 million of them into a hat, and just selected 10,000 of these measures at random, divided each by its observed standard deviation across the population in the initial survey, and defined the sum of all of them as a “goodness” measure, then calculated it for each person, you’d have a very good indicator of well-being. Even better would be something like 10 different random sets of 10,000 measures, calculating each one to get an ensemble… You could ask this question through time, and define a method for kernel-weighted averaging through time, and you’d get a measure of social welfare that automatically adapted to changing desires. Maybe social welfare legitimately goes down because people have lots of buggy whips and rather quickly in the last 10 years they became obsolete and everyone wants a “motor buggy” now…

That we don’t do this thing is down to a lack of imagination and statistical knowledge, not because it’s a bad idea or impossible.
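Here is a rough sketch of that index on simulated data, at a much smaller scale than the 10,000-of-10-million version described above. Everything here is hypothetical: the survey values are random numbers, the function names are invented, and every measure is assumed to be coded so that higher means better. The time-averaging step is left out.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical survey: data[i][j] is person i's value on measure j.
# Measures are given different scales on purpose.
n_people, n_measures = 200, 50
data = [[rng.gauss(0, 1 + j % 5) for j in range(n_measures)]
        for _ in range(n_people)]

# Observed population standard deviation of each measure.
sds = [statistics.pstdev([row[j] for row in data]) for j in range(n_measures)]

def goodness_scores(data, sds, measure_ids):
    """Unit-weighted index: sum of the chosen measures, each divided by its SD."""
    return [sum(row[j] / sds[j] for j in measure_ids) for row in data]

def ensemble_scores(data, sds, n_sets, set_size, rng):
    """Average the index over several randomly drawn sets of measures."""
    totals = [0.0] * len(data)
    for _ in range(n_sets):
        measure_ids = rng.sample(range(len(sds)), set_size)
        for i, score in enumerate(goodness_scores(data, sds, measure_ids)):
            totals[i] += score
    return [total / n_sets for total in totals]

single = goodness_scores(data, sds, rng.sample(range(n_measures), 10))
ensemble = ensemble_scores(data, sds, n_sets=10, set_size=10, rng=rng)
print("first three single-set scores:", [round(s, 2) for s in single[:3]])
print("first three ensemble scores:  ", [round(s, 2) for s in ensemble[:3]])
```

Dividing by the standard deviation puts every measure on a comparable scale, so no single measure dominates the sum; the equal weights are the “improper” part in the sense of the paper linked above.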

• Steve says:

I agree with your point 1 and your point on 2, but I am confused about why you don’t see a conflict between them. Your answer to the Simpson’s paradox question is that it doesn’t arise if we are really looking at a causal process. Your answer to the social welfare problem is that we can make much better measures of social welfare. Those both seem true, but how can we identify a causal process involving a social welfare function when there is no actual non-arbitrary social welfare function, just lots of different ways of thinking about social welfare, no single one of them being the objectively true social welfare function? You are still always going to have the possibility from the original comment that we might maximize one arbitrary social welfare function while creating a bad outcome if social welfare is thought of differently. Maybe you think that theoretical possibility is de minimis, but think about various movements like fundamentalism in both the West and Muslim worlds. That’s a rejection of certain parts of modernity that most of us believe are good, but that some people think are horrible. I have no problem with arguing against the view of how society should be structured that fundamentalists of different ilk want, but I do have a problem with treating the disagreement as if it can be replaced by a technocratic, data-driven question. At some level that is not honest or respectful of the other side’s position. For example, I don’t treat not getting to own women as a loss on my conception of social welfare. Presumably, some of these fundamentalists do. No amount of data and carefully constructed indexes of social welfare is going to resolve that difference.

2. jim says:

You shouldn’t say that statistics make democracy better. Statistics doesn’t matter to democracy one way or the other. Like anything, it can be used honestly or dishonestly, for better or for worse. How it’s used is in the hands of the user, not an intrinsic property of the methods or process.

• ken says:

Jim is right (above).

Democracy/voting is very simple bean-counting — no statistical analysis needed or helpful.

However, “democracy” itself is a very complex theory & practice… loaded with logical contradictions and pitfalls.

3. Yuling says:

Statisticians are likely to be more democratic than mathematicians for the use of Monte Carlo in lieu of quadrature?

• gec says:

Hah, very likely!

Related: Kruschke’s excellent textbook actually uses campaigning as an example to explain the Metropolis algorithm. Does the ambitious candidate stay in the same district or kiss babies in an adjacent one?
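For readers who haven’t seen it, here is a minimal sketch in the spirit of that campaigning example, not a reproduction of the textbook’s code. The row of seven districts and their relative populations are made up. Each day the candidate proposes moving to a neighboring district, always goes if it is more populous, and otherwise goes with probability equal to the ratio of the two populations.

```python
import random

def campaign(populations, n_days, rng):
    """Random-walk Metropolis over a row of districts.

    Returns the fraction of days spent in each district.
    """
    n = len(populations)
    current = n // 2                     # start in the middle district
    visits = [0] * n
    for _ in range(n_days):
        proposal = current + rng.choice([-1, 1])   # coin flip: left or right
        # Proposals off either end have population zero and are rejected.
        proposed_pop = populations[proposal] if 0 <= proposal < n else 0
        # Accept with probability min(1, proposed / current).
        if rng.random() < proposed_pop / populations[current]:
            current = proposal
        visits[current] += 1
    return [v / n_days for v in visits]

# Hypothetical relative populations of seven adjacent districts.
populations = [1, 2, 3, 4, 5, 6, 7]
target = [p / sum(populations) for p in populations]

fractions = campaign(populations, n_days=200_000, rng=random.Random(0))
for district, (frac, tgt) in enumerate(zip(fractions, target)):
    print(f"district {district}: visited {frac:.3f}, population share {tgt:.3f}")
```

In the long run the share of days spent in each district approaches that district’s share of the population, even though the candidate only ever compares the current district with one neighbor.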

• Yuling says:

Well that is a sad point! The seemingly equal one-person-one-vote-democracy in Monte Carlo part is jeopardized by the endless lobbies and campaigns during the Markov chain and Metropolis step!

4. gec says:

My first stats teacher in high school began the class by saying, “you can’t be a responsible citizen of a democracy without understanding statistics.” (I agree and tell my own students the same.)

As Andrew and other commentators have noted, statistics is important for policy makers both because it provides ways to infer citizen attitudes AND because it offers ways to manipulate those attitudes or communicate in intentionally perplexing ways. Conversely, it’s important for citizens to understand stats both to know how to evaluate information for forming opinions AND to understand how that information can be used to manipulate them.

I suppose this falls under “openness” in the sense that it is making sure that everyone has the same knowledge of the rules of the game.

5. Davi Moreira says:

I suppose you will use some slides. Could you, please, share your presentation with us? Thanks in advance!