Two people pointed me to a study claiming fraud in the 2019 Indian election by the Bharatiya Janata Party, the current governing party of the country, and they also pointed me to criticisms of that study.
As a complete outsider to Indian politics, I found the claims and counter-claims hard to follow, so rather than trying to make any assessment, I’ll just pass along the concerns of my correspondents.
Ujjawal Krishnam writes:
Apparently there are some serious issues with the paper’s fundamental assumptions. India is a difficult country, objectivity seems less necessary or less desirable. Just last year there were manipulated reports against Zuck’s Meta in The Wire that precipitated a global saga.
There are some valuable criticisms of the data set here.
I asked Krishnam if he could point to a specific criticism he found helpful, and he linked to this, adding:
Gujarat 2017 Assembly Elections data follow a similar pattern, for instance here.
This is quite common in India, which has grown more politically polarised and unpredictable. It is this common sense that makes the paper controversial. Of course, the rift between Hindus and Muslims has deepened, but there are many other factors (inflation, caste alignment, unemployment, etc.) that influence such an outcome. Given the author’s own conclusion, the paper’s title is clearly mischievous.
Krishnam also points to this op-ed by someone he characterizes as “a pro-Government columnist, but makes some important points.”
He also points to this discussion and a blog here which argues (a) that, yes the BJP outperformed in close elections in BJP-controlled states but it also outperformed in close elections in states controlled by other parties, and (b) one possible explanation is just that the BJP campaigned better, focusing its effort in close elections and “identifying the right candidate, getting the local caste math right, rallies by popular leaders, get-out-the-vote measures on election day and so on.”
Meanwhile, Samarth Bansal wrote this summary and adds:
Here are some questions – keeping them generic and avoiding those that require specific understanding of Indian election data. But these are important in themselves to set the right context and discuss processes.
1. The field of election forensics via statistics is non-existent in Indian political discourse. And in that sense, this paper is breaking new ground. So I want to introduce readers to this idea. Without getting into the specifics of this paper – can you share some context if statistical irregularities in voting data are considered legitimate to contest the authenticity of an election? Can we learn something from the experience in the US and other countries?
2. One of the things I found in the paper was it relies solely on the available data to test hypotheses and doesn’t account for qualitative/theoretical political analysis. One can argue that’s done to make the paper unbiased by solely relying on numbers. But one can also argue that numbers don’t capture everything, and in the process we lose a lot of context. For a claim as huge as electoral manipulation, how should we think about ignoring political dynamics while writing a stats-heavy paper?
3. Are the methods used in the paper statistically sound? I understand you don’t have context on Indian election, but would it be possible for you to comment on the statistical soundness of the paper? Does it read like rigorous work? (Point is to share with readers that irrespective of the conclusion, and disagreements someone may have about assumptions and methods, can we say if this is a good piece of research following the scientific process?)
4. The paper got into public attention thanks to the twitter thread. And the debate got murky. I feel (just my view) that had it not blown up, Indian academics would have discussed what they think is strong or weak in the paper over email and conferences, as scientific knowledge is created and disseminated. But the blow up on social media bypassed that process – most people only read the abstract and a bit of the conclusion, which has some strong statements, and the nuance is being lost. What do you think? Is that the right reading? And in this information environment, what’s the additional responsibility of the researchers, if any?
He also writes, “Your paper on statistical significance became part of the discourse. (Prof Ravi is an economist and a member of the Prime Minister Economic Advisory Council.)”
Regarding question #1, I immediately think of the 2000 presidential election and the Palm Beach County (butterfly ballot) results. The statistical analysis seemed fairly conclusive, particularly when combined with the design of the ballot and the hanging chads. However, even with such a conclusive case, the idea that you should override the actual results seems questionable. So, my question would be: under what conditions would the statistical analysis be sufficient to question the authenticity of election results? Since you can always question anything (for examples, see everything regarding our last presidential election), I think “question” means something more, such as declaring the results illegitimate.
Re:
> can you share some context if statistical irregularities in voting data are considered legitimate to contest the authenticity of an election?
Speaking as an outsider to the community of statistical modeling, I can’t help but wonder if statistical tests such as McCrary need to be calibrated differently for multi-party democracies (i.e. what is normal may also be different enough, and may need to be understood properly, before certain patterns are declared outliers).
It seems that precise control is accepted as one benign explanation for a discontinuity. I wonder if there are more:
1. In the context of Indian general elections, it is not the case that all parties contest all the seats. Even the ruling party (and electorally the most dominant party in the last ~3 decades of Indian politics) only contested ~80% of the seats. Smaller regional parties contest seats in their regional strongholds. This might result in patterns that make all winners look disproportionately strong.
2. It is assumed that closely-fought seats should be a coin-toss. But that would only hold true at the limits of the vote margin (±delta, delta → 0). I can imagine the McCrary test requiring a smaller delta in cases where there is a dominant performance by one side. That would also result in sampling errors, with a smaller number of samples falling within the ±delta margin. It seems to be assumed that standard bandwidth selection algorithms and p-value results are a reliable guide here, but the math for those seems dense, and I can’t help but wonder if there are some catches buried in there.
In my exploration of the datasets, the vote share percentages of the winners and runners-up seem to be linearly proportional (positively and inversely, respectively) to the victory margins, and did not have any “visual oddities”. But when subtracted, and put through a “mysterious black box”, they produced an anomaly.
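The coin-toss point in item 2 above can be illustrated with a small simulation. This is only a sketch under made-up assumptions, not the paper's method and not McCrary's density test: it draws hypothetical victory margins for one party from a normal distribution with a positive mean (a "dominant" party, with no manipulation anywhere), then checks how often that party wins among seats with |margin| below delta. The function names and all parameter values (20,000 seats, mean margin 10, standard deviation 15) are invented for illustration.

```python
import math
import random


def simulate_margins(n_seats, mean_margin, sd_margin, seed=0):
    """Draw hypothetical victory margins (party minus best rival), in percentage points."""
    rng = random.Random(seed)
    return [rng.gauss(mean_margin, sd_margin) for _ in range(n_seats)]


def close_race_summary(margins, delta):
    """Among seats with |margin| < delta, return (n, wins, win share, z-score vs. 50%)."""
    close = [m for m in margins if abs(m) < delta]
    n = len(close)
    wins = sum(1 for m in close if m > 0)
    if n == 0:
        return n, wins, float("nan"), float("nan")
    share = wins / n
    # Normal approximation to a binomial with success probability 0.5.
    z = (wins - 0.5 * n) / math.sqrt(0.25 * n)
    return n, wins, share, z


# Assumed, illustrative parameters: a party that is ahead on average, no fraud.
margins = simulate_margins(n_seats=20000, mean_margin=10.0, sd_margin=15.0, seed=1)

results = {}
for delta in (20.0, 10.0, 5.0, 1.0):
    results[delta] = close_race_summary(margins, delta)
    n, wins, share, z = results[delta]
    print(f"delta={delta:5.1f}  close seats={n:6d}  win share={share:.3f}  z={z:6.2f}")
```

In this toy setup the win share among "close" seats sits well above one half for a wide window and drifts toward one half as delta shrinks, while the number of seats in the window drops sharply. That is the trade-off described above: a narrow window makes the coin-toss assumption more plausible but leaves fewer observations to test it with.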
makes a very important point.
India is not just the world’s largest representative democracy. It is also one of the most decentralised ones to hold “fair elections,” where politicking may overlap at various places: Panchāyata > State > Union. All this will throw up a high degree of multicollinearity and thus errors! But without thoroughly incorporating them, the possibilities that Das has tried to explore may never be accurately determined.
This is an interesting comment:
> India is a difficult country, objectivity seems less necessary or less desirable.
Particularly the less desirable part.
Ironically the backlash against this paper is in line with the title and does not bode well for academic freedom in India. The author (Sabyasachi Das) has apparently left Ashoka University.
This student newspaper article covers some of the issues at play: https://www.the-edict.in/post/debate-and-attacks-surround-sabyasachi-das-working-paper-on-democratic-backsliding
I have investigated multiple editorial irregularities in the Indian news media (NYT has now independently confirmed NewsClick’s SA/China State-sponsored newswire propaganda scam that I had originally reported). Interestingly, I am a witness in the criminal cases against The Wire and NewsClick news outlets: If that downgrades the overall perception of press freedom in India, will I be blamed? It must be noted, however, that I am also a journalist — and a critic of the ruling BJP.
The right to critique a piece of research and writing, therefore, is drawn under the ambit of “academic freedom,” too!
I do not see a dilemma. But then the question is, what is the political ideology of Elizabeth Holmes? I am not primarily interested, phew!
> Without getting into the specifics of this paper – can you share some context if statistical irregularities in voting data are considered legitimate to contest the authenticity of an election?
Wasn’t this part of the failed attempt to overturn the results of the 2020 US presidential election? Or rather, very poor attempts at statistical analysis of voting, involving versions of the lottery and/or gambler’s fallacy. There’s this [false claim](https://www.factcheck.org/2020/12/false-claim-about-bidens-win-probability/), for instance. See also [this video by Matt Parker](https://www.youtube.com/watch?v=etx0k1nLn78).
I think the problem with using this type of analysis for election manipulation is that, unfortunately, the public understanding of statistics and statistical reasoning is very low, and so trust is placed instead on pundits who espouse various statistical claims when making arguments. These pundits themselves don’t understand statistics, so they’ll often parrot a spurious statistical claim without due investigation into its veracity. If they’re called out on it, and they stand their ground regarding the overall argument, public opinion on the matter tends to split based on pre-existing partisan views. The result is then a growing distrust in the use of statistics, which becomes viewed as either a rhetorical device that muddies the waters of an otherwise clear argument, or a tool used by opponents to denigrate the in-group’s views.
To then use statistical analysis to investigate fraud in a political process would at best be useless in convincing the side that “loses out” if fraud is proven, or at worst, a sign that sinister elite forces are using artifice to deny the rightful outcome of an election.
As a corollary of McCrary (2006), the density test for the parliamentary proceedings may also “provide strong evidence of manipulation.” For the time being, suppose that the Indian Lok Sabha is corrupt. Here, a three-line whip (or absence thereof) may then be read as a factor of “manipulation!” Y. Yi (2018) and Frölich (2017) discuss the covariates. But at this stage, these are purely theoretical questions.
So, I think, your apprehension is grounded in such a scenario.