Statistical fallacies as they arise in political science (from Bob Jervis)

Posted on March 3, 2021 9:30 AM by Andrew

Bob Jervis sends along this fun document he gives to the students in his classes. Enjoy.

Theories of International Relations

Assume that all the facts and assertions in these paragraphs are correct. Why do the conclusions not follow? (This does not mean that the conclusions are actually false.) What are the alternative explanations for the facts? What tests would tend to confirm or disconfirm these explanations? There are no tricks here and only in question 1a is the specific wording crucial. Numbers 19 and 22 are of a different character in that they do not involve specific fallacies but should provoke thought.

1. Doctors have found that in patients with a specified set of symptoms a certain kind of spinal fusion operation produces a success rate of 72% (as measured by the patient’s assessment that he or she feels much better a year later). Therefore if you have those symptoms you should have the operation.

1a. Today older people tend to be more politically conservative than younger ones. The explanation lies in the aging process: older people are more set in their ways and have more to lose by social change.

2. There is a positive correlation between the per capita GDP of a country and the degree to which it is democratic. Therefore as the poor countries get richer, they will also become more democratic.

3. Taking a random sample of wars, we study debates within each country that preceded the decision to fight and find that in the majority of cases the decision-makers, both civilian and military, were overoptimistic about the chances of victory. That is, they usually thought they would win and even if they did not, they thought they would do better than they actually did. From this I conclude that wishful thinking exists and is a cause of wars.

4. Taking a sample of Republican leaders, Republican voters, Democratic leaders, and Democratic voters, I compute the average Liberal-Conservative score for each group and find that while the Republican leaders are much more conservative than the Republican voters, the Democratic leaders are only slightly more liberal than the Democratic voters. From this I conclude that the individual Democratic leaders are moderates (in the sense of being close to the middle of the existing political spectrum).

5. Many theorists claim that domestic instability tends to lead to foreign aggression. Others have made the opposite claim. The posited linkages are obvious. Assume that I develop a good measure of both variables. For each year I compute the total amount of domestic instability in all countries in the international system and correlate this with the total amount of external aggression by all states. I find no correlation at all and conclude that, contrary to both theories, there is no connection between domestic instability and war. (There are at least two major related fallacies here.)

6. To test Galtung’s proposition that status incongruity (i.e. being simultaneously high on some dimensions of international status and low on others) leads to aggressiveness, I correlate a measure of each state’s status incongruity (whose validity is assumed) with the number and intensity of wars it has been in. I find no correlation and conclude that the proposition is false.

7. In explaining the origins of World War II, A.J.P. Taylor is correct in making almost no reference to Hitler’s extermination of the Jews. He is not concerned with making moral judgments about Hitler nor is he arguing whether or not the Allies should have made war on Hitler for the sake of those in Germany and the occupied territories. All he is trying to do is explain how and why the war started, the degree to which Hitler was unalterably aggressive, and the extent to which other countries share responsibility for the war. For questions such as these Hitler’s racial policies are irrelevant. Without endorsing Taylor’s answers to the questions he has set himself, or claiming that those questions are the most important ones, we can see that he was right to avoid being drawn into a discussion of Hitler’s domestic policies.

8. In discussing the balance of power system, one author argues: “A system containing merely growth-seeking actors will obviously be unstable; there would be no provision for balancing or restraint.” (Donald Reinken, “Computer Explorations of ‘Balance of Power’”, Morton Kaplan, ed., New Approaches to International Relations, p. 469).

9. Finding generalizations is much less important than is usually thought. Although they are useful for description, they do not help to explain anything. Knowing that X happens often, or even that it always happens, does not help explain why X occurs.

10. To study whether wishful thinking (i.e. the distortion of perceptions by desires) is a cause of crises within an alliance I take several alliances and examine a random sample of crises (defined as a sudden and unexpected occasion in which the partners find themselves in disagreement over an important issue). In all cases I find that each partner had expected the other to act in support of, or at least acquiesce in his policy. I therefore conclude that empirical evidence shows wishful thinking to be a major cause of intra-alliance crises.

11. It has been found that psychiatrists have the highest suicide rate of all occupational groups. This finding is explained by the proposition that becoming involved with other people’s problems creates strains that often cannot be handled even by a person with professional training. (Assume that we have an accurate measure of suicide rates.)

12. Assume that we have good measures of the success of policies and the amount of nongovernmental advice solicited. We find a strong negative correlation between these two measures. I conclude that the more the President listens to outside “experts” the worse off he will be.

13. Most Washington lobbyists say that they exert significant influence over the outcome of Congressional votes. On a series of close Congressional votes I ask Congressmen if they were influenced by lobbyists. The percentage answering “no” varies from 95 to 98. (Assume that Congressmen know what influences them and are telling the truth.) I conclude that, at least in the cases I have studied, lobbyists have little influence. (There are at least two major fallacies. Don’t stop with the easy one.)

14. Since every war has a loser, we can deduce, without even having to examine the pre-war debates, that in at least half of the cases the nation’s leaders overestimated their chances of victory.

15. What inferences about discrimination can we draw from the fact that the average batting average of blacks in the major leagues is higher than that of whites? Or from the finding that within the group of science Ph.D.s who have had research and teaching jobs for at least three years, women produce good scientific papers at a higher rate than do men? What about the following argument: we take a sample of male and female sales personnel and management employees and find that the women “are at least as reliable, somewhat less complacent, and somewhat more sociable. Women are a bit more impulsive than men, and certainly do not trail men in their energy level or willingness to work.” From this we conclude that

These (findings) clearly destroy many of the myths relating to sex difference in effective work potential and demonstrate that the under-representation of women in responsible jobs reveals that the process by which people are admitted to these jobs unfairly discriminates against women. (L.A. Times, Sept. 4, 1974, part 1A, p. 45)

16. An ad for a psychological biography of Nixon asked, “Did the bombing of Hanoi begin at the playing fields of Whittier?” What kind of evidence would be needed to confirm or refute the claim that Nixon’s policy in Vietnam is best explained by his personality?

17. Psychological theories yield several related propositions about the effects of tensions and crises. Because of time pressures, limitations on information channels and information processing abilities, and emotional strains, we expect that in a crisis: 1) decision-makers will tend to perceive the range of their own alternatives to be more restricted than those of the other side; 2) the search for one’s own alternatives will become increasingly narrow; 3) as the crisis develops, more and more information is flatly rejected, and 4) dissenters are increasingly excluded from the centers of decision. To confirm or disconfirm these propositions I plan to study several crises that led to wars (e.g. July 1914, August 1939). In order to provide the necessary comparisons, I will also study several crises that were resolved peacefully (e.g. Cuban Missile Crisis, Munich, Fashoda).

18. “The relevance of [Gerhard Ritter’s discussion of Allied war aims in World War I to his] history of militarism in Germany is not easy to detect. One of Ritter’s criticisms of Fischer was that he failed to talk about other people’s war aims. This criticism only means that no German historian should say that Germany behaved badly without also showing that other nations behaved worse. Ritter shows here that the Allies’ policy was at least no better than the Germans’.” (Norman Stone, “Gerhard Ritter and the First World War,” in Historical Journal, vol. 13, 1970, p. 161).

19.

Explanations in international relations, and in the social sciences in general, are different from those in the physical sciences because they fail to perceive the essential difference from the standpoint of causation, between a paper flying before the wind and a man flying from a pursuing crowd. The paper knows no fear and the wind no hate, but without fear and hate the man would not fly nor the crowd pursue. If we try to reduce it to its bodily concomitant we merely substitute the concomitant for the reality expressed as fear. We denude the world of meaning for the sake of a theory, itself a false meaning which deprives us of all the rest. We can interpret experience only on the level of experience. (R.M. MacIver, Society, p. 530).

This means that an understanding of international relations requires that we reconstruct the values, emotions, and calculations of decision-makers. The only way to explain their behavior is to see the world the way they saw it.

20. To determine whether external or internal variables are a more important source of foreign policy, I measure some national attributes, e.g. size, level of political and economic development, nature of the regime–some characteristics of dyads, e.g. geographical distance, similarity or difference of regimes, similarity or difference of power–and, as the dependent variables, events data on conflict and cooperation for each state and dyad. I propose certain hypotheses about the effect of each relationship and national attribute on the amount of conflict behavior it engages in; the closer the members of a dyad, the greater the conflict. (For the purposes of this exercise, the content of these hypotheses doesn’t matter). When I compare the actual correlations with the predicted ones, I find a much closer match for the propositions involving national attributes than for the ones involving relations. From this I conclude that internal factors are more important causes of foreign policy than are external factors. (There are at least three fallacies here.)

21. To see whether the amount of conflictual behavior that a state initiates is inversely related to the amount of cooperative behavior it initiates, I group countries according to national attributes (e.g. see above). I find that there is a direct relationship. Those kinds of states that initiate a lot of conflict (e.g. large, developed, powerful ones) also initiate a lot of cooperation. I explain this finding by the argument that to maintain even minimal order in the international system (and the data is gathered from a period of relative peace), nations cannot be totally hostile to each other. If peace is to be kept, a nation that initiates a lot of hostile behavior toward another must also initiate a significant amount of cooperative behavior toward it. As the data show, nations do not aim undiluted hostility at each other. (There are at least two fallacies here.)

22.

“The historian need not and cannot (without ceasing to be a historian) emulate the scientist in searching for the causes or laws of events. For science, the event is discovered by perceiving it, and the further search for its cause is conducted by assigning it to its class and determining the relation between that class and others. For history, the object to be discovered is not the mere event, but the thought expressed in it. To discover that thought is already to understand it. After the historian has ascertained the facts, there is no further process of inquiring into their causes. When he knows what happened, he already knows why it happened…

The value of generalization in natural science depends on the fact that the data of physical science are given by perception, and perceiving is not understanding. The raw material of natural science is therefore ‘mere particulars’ observed but not understood, and taken in their perceived particularity, unintelligible. It is therefore a genuine advance in knowledge to discover something intelligible in the relations between general types of them. What they are in themselves, as scientists are never tired of reminding us, remains unknown: but we can at least know something about the patterns of facts into which they enter.

A science which generalizes from historical facts is in a very different position. Here the facts, in order to serve as data, must first be historically known; and historical knowledge is not perception, it is the discerning of the thought which is the inner side of the event. The historian, when he is ready to hand over such a fact to the mental scientist as a datum for generalization, has already understood it in this way from within. If he has not done so, the fact is being used as a datum for generalization before it has been properly ‘ascertained’. But if he has done so, nothing of value is left for generalization to do.

If, by historical thinking, we already understand how and why Napoleon established his ascendancy in revolutionary France, nothing is added to our understanding of that process by the statement (however true) that similar things have happened elsewhere. It is only when the particular fact cannot be understood by itself that such statements are of value.” (R. G. Collingwood, The Idea of History, pp. 214, 222-3)

Both historians and political scientists should be able to agree that Collingwood is right. There may be universal laws and generalizations. If they exist, they are to be found through the cumulation of case studies. If we understand several cases, we can see what they have in common and how they differ. But each case must be understood in its own terms, by examining it in detail in its own context. We cannot learn why an outcome occurred in one case, or why an actor behaved as he did in one instance, by looking at other cases. These comparisons come only after we have explained each case. For how could we explain one event or one problem by comparing it to others? This might tell us if the case was unusual or if we could construct a valid generalization, but it could not help explain the case itself. Since the causes of any outcome obviously lie in the preceding events, looking elsewhere is at best a distraction.

Thus, for example, the way to discover the impact of the frontier on American life and politics is to intensively study the American frontier itself–what life was like there, what were only myths, and what patterns were common. Turner’s research was skimpy and his conclusions may be incorrect, but his general approach was surely the proper one. Similarly, it would be foolish to try to explain American foreign policy by looking at the foreign policies of other countries. For example, it is foolish to try to refute the revisionist arguments about American policy after World War II by comparing it to the policies of Russia and of the European states.

Furthermore, it is usually a basic intellectual error to try to find one explanation that can cover several cases. Even when the outcome is the same–e.g., American intervention abroad–the causes often differ from case to case.

23.

“White prejudice and any specifically Negro characteristics account for much less of the difference in employment rates between Negroes and whites than would otherwise appear.” This is shown by the fact that when we look at what can be called “‘the Statistical Negro’–that is, the Negro when all non-racial factors (e.g., education, urban-rural residence, etc.) have been controlled for–is a very different fellow from what will be called the Census Negro. In some respects the Statistical Negro is indistinguishable from the white, and in all respects the differences between him and the white are smaller than those between the Census Negro and the white.” Thus whatever the effects of previous discrimination, current discrimination is relatively unimportant. “If overnight Negroes turned white, most of them would go on living under much the same handicaps for a long time to come.” (Banfield, The Unheavenly City, pp. 69-73)

Think about this argument, and particularly the claim for the importance of comparing the achievements (e.g. income levels) of the Statistical Negro to that of whites.

24.

“The strongly adverse relation between cigarette smoking and health led to the banning of cigarette advertising on television. Since television advertising of cigarettes was discontinued, sales have not been noticeably affected. With the awareness that the money previously spent on television advertising was seemingly wasted, it is not immediately obvious why the tobacco industry continues to advertise at all. Knowing the intensity of addiction experienced by most smokers, it is probably not necessary to convince them that they should smoke. Indeed, most regular smokers find it very difficult not to smoke and certainly don’t need encouragement to continue. Yet, the tobacco industry continues to advertise heavily.

If the money spent on television advertising was useless, why continue the same practice in the printed media? What is the tobacco industry getting in return for their investment? One return is the promotion of the notion that smoking cigarettes is a matter of user’s choice and not an uncontrollable addiction. A more disquieting possibility is that this investment serves as hush money, softening the telling of how bad the story of smoking versus health really is.”

William Oldendorf, “Cigarette Advertising”, Science, vol. 184, April 10, 1974, p. 112.

25.

“Parolees…do little better in the community [as measured by the recidivism rate] than those who are not paroled [and serve out their full sentences], which suggests that ‘discretionary release’ is really potluck, and those who decide who gets paroled have only the sketchiest idea of who has been ‘rehabilitated’.” (Tom Wicker, “The Lessons of Parole”, New York Times, March 8, 1974, p. 33)

26. 75% of cars that are stolen had been left unlocked. Therefore locking your car will reduce the chance that it will be stolen. (If this gives you trouble, 40 is similar and easier.)
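One way to probe item 26 numerically is sketched below. It is only an illustration with invented inputs: the share of all cars that are left unlocked is not given in the exercise and is assumed here, and the function name `theft_risk_ratio` is hypothetical. The sketch shows that the 75% figure alone does not settle whether unlocked cars are stolen at a higher rate than locked ones.

```python
def theft_risk_ratio(share_stolen_unlocked, share_all_unlocked):
    """Ratio of theft risk for unlocked cars to theft risk for locked cars.

    share_stolen_unlocked: fraction of stolen cars that were unlocked
        (0.75 in item 26).
    share_all_unlocked: fraction of ALL cars that are left unlocked
        (an ASSUMPTION; the exercise does not supply it).
    """
    # By Bayes' rule, P(stolen | unlocked) is proportional to
    # P(unlocked | stolen) / P(unlocked). The overall theft rate
    # cancels when the two risks are divided.
    risk_unlocked = share_stolen_unlocked / share_all_unlocked
    risk_locked = (1 - share_stolen_unlocked) / (1 - share_all_unlocked)
    return risk_unlocked / risk_locked


# Hypothetical values for the share of cars left unlocked.
for assumed_share in (0.50, 0.75, 0.90):
    ratio = theft_risk_ratio(0.75, assumed_share)
    print(f"if {assumed_share:.0%} of cars are unlocked, risk ratio = {ratio:.2f}")
```

A ratio above 1 means unlocked cars are stolen more often than locked ones under that assumption, a ratio of 1 means no difference, and a ratio below 1 means the reverse. Even a ratio above 1 would describe a correlation only.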

27. In studying the factors that lead a state to conclude that others are a potential threat to it, I examine a number of cases where states have come to see others as a menace. In almost all these cases I find that the state seen as threatening had broken a “rule of the international relations game”. From this I conclude that if one state breaks a “rule of the international relations game”, others will see it as a threat. (Do not worry about the vagueness of the idea of “rules of the game”.)

28. I find that harsh peace treaties are usually followed by long periods of peace whereas soft treaties (i.e. those in which the winner does not take a great deal from the loser) usually lead to new wars quite quickly. I therefore conclude that I can tell decision-makers of countries that win a war: “The best chance of ensuring that the peace will last is to be very tough and force the other side to accept harsh terms.”

29. In disputing the argument that the Soviet Union consolidated its hold over East Europe not because she sought to expand as far as possible, but because she wanted to guarantee her own security against Western attack, one scholar points out that at the same time Russia was also encouraging secessionist movements in China, moves that cannot be explained by the desire for security. Is this line of argument legitimate?

30. Examining a random sample of wars, I find that the side that initiates the fighting (assume that we have solved the obvious empirical problem this involves) usually loses the war. From this I infer that it is usually politically and/or militarily disadvantageous to strike the first military blow.

31. In order to investigate the causes of Soviet armed intervention in East Europe, I look at the relevant cases: East Germany in 1953, Hungary in 1956, Czechoslovakia in 1968, and perhaps Poland in 1981, and find that each time the local Communist Party was losing control of the situation. I therefore conclude that the Soviets were very likely to intervene whenever their client parties were unable to keep the situation in hand.

32. In order to determine the proportion of cases in which a state is able to achieve military surprise, I look at a random sample of cases of the initiation of war. I find that surprise occurs in almost all of them. From this I conclude that most cases of attempted surprise succeed.

33. Studying the causes of all or most of the wars in international politics is fairly foolish: what we are most concerned with are wars which have very great consequences. Therefore we should mostly–if not only–study the causes of great wars such as World War I or World War II. (There are several problems here.)

34. In order to test the proposition that changes in the power relations among the leading states (what are often called “power transitions”) are an important cause of major wars, one should look at the major wars that have occurred and see whether they were preceded by the posited power changes.

35. In order to reduce the amount of time I am likely to have to spend on the phone waiting to speak to someone who can make my airplane reservations, I plan to place my call at a time of day that few others are likely to be calling. (There are at least two problems here.)

36. The USSR was able to gain many more spies in the West than the latter was able to place in the USSR. The explanation must be either that Communism had greater ideological appeal in the West than capitalism and Western democracy did in the USSR or that the Soviets were willing or able to pay a great deal more for secrets than the West was.

37. According to the New York Times (June 4, 1997), a study conducted by the United Negro College Fund found that “contrary to the widespread belief that black students are a dominant presence in urban public schools, less than one-third of black public school students attend schools in large cities.” (There is no assertion of causality here, but what is wrong with this sentence as a descriptive statement?)

38. In trying to support the claim that the US sponsored the coup in Iran in 1953 because of anti-Communism, not because of the desire to gain a share of the oil fields, one scholar notes that

The Cold War was at its height in the early 1950’s and the Soviet Union was viewed as an expansionist power seeking world domination. Eisenhower had made the Soviet threat a key issue in the 1952 elections, accusing the Democrats of being soft on communism and of having “lost China.” Once in power, the new administration quickly sought to put its views into practice: the State Department was purged of suspected communists, steps were taken to strengthen the Western alliance, and initiatives were begun to bolster the Western position in Latin America, the Middle East, and East Asia. Viewed in this context, and coming as it did only two weeks after Eisenhower’s inauguration, the decision to overthrow Mossadeq appears merely as one more step in the global effort of the Eisenhower administration to block Soviet expansionism. (Mark Gasiorowski, “The 1953 Coup D’Etat in Iran,” International Journal of Middle East Studies, vol. 19, September 1987, p. 275)

Do you find this way of reasoning legitimate and persuasive?

39. To determine the causes of wars, I look at a random sample of wars, examining in detail the domestic, bureaucratic, and international factors that seem to be involved and from these results build a general theory about the relative importance of these influences.

40. Since most automobile accidents occur in trips of 5 miles or less, I should substitute long drives for short ones whenever possible.

41. In his famous 1954 Foreign Affairs article enunciating the massive retaliation doctrine, John Foster Dulles said that “a potential aggressor must not be left in any doubt that he would be certain to suffer damage outweighing any possible gains from aggression.” Why is this neither necessary nor sufficient for deterrence?

42. To test the argument that the main sources of US weapons procurement policy lie in the outlooks and preferences of the armed services, I look at the weapons the US has bought over a period of years and see if they correspond to the services’ desires.

43. In “Toughen the Will and You Toughen the Mind,” Andrew Revkin reports (New York Times, July 21, 1997) on the effect of an Outward Bound program for inner-city teenagers:

87 percent…who participated in the…program either had graduated from school or were still attending, compared to an overall graduation rate of less than 40 percent at the school. Half the participants have gone to college…. Reading scores rise more than half a grade, and math scores even more.

The…teenagers were recruited…from ninth and tenth graders who scored in the bottom third of their class on literacy tests. More than two dozen were invited to try a three-day hike in the Catskills in May, but only 12 took up the offer. Now nine remain.

What inferences can one draw about the influence of this program on various categories of teenagers?

44.

Policy-maker: “If you scholars are good for anything, you should be able to tell me what policy instruments are likely to work under what circumstances. Can you? I need to know in order to guide me in what I should do in the future.”

Eager scholar: “Yes, sir. I will examine the outcomes of a random sample of cases in which the US used economic pressure and compare the results with those that occurred in a random sample of cases in which the US used force.”

Would this meet the policy-maker’s requirements? What inferences could be drawn from this study? How would you design a better one?

45. A graduate program that believes it has greatly improved its quality over the past 5 years is shocked to find that yield (the percentage of those accepted into the program that actually enroll) has declined, not increased. Does this show that the program’s reputation is lower than it was before? Would the inference be different if the yield at peer institutions had increased? declined? remained steady?

46. If I have a serious heart disease and want the best treatment, I should select the hospital that is the best as measured by the available statistics showing its rate of success in dealing with this disease.

47. Many HMOs offer to pay for health club memberships for those who join. The reason is that they want to encourage people to exercise and so stay healthy. (This is tricky. The statement may be correct, but what other–perhaps stronger–reason would there be for HMOs to make this offer?)

48. About 30 years ago, Brown University radically changed its curriculum by drastically reducing its requirements. Since then, its graduates have achieved much greater success after they graduate (assume the validity of the measures employed). This shows that the students learned much more from the new curriculum than from the old one. (There are at least two fallacies here.)

49. Everyone tells me that Professor Nit is a hard grader whose class is very challenging and Professor Wit, who teaches the same course, is an easy grader. But through a friend at the Registrar’s office I have seen their grade sheets and the distribution of grades is the same. So the rumors must be incorrect.

50. “Smoking increases your chances of lung cancer by 900%.” That is all you have to know to conclude that you shouldn’t smoke.
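The arithmetic behind a statement like the one in item 50 can be sketched as follows. The baseline risks used here are invented for illustration (the item gives only the 900% figure), and the function name `absolute_risks` is hypothetical. The sketch converts a percentage increase into absolute risks so that the two can be compared.

```python
def absolute_risks(baseline_risk, percent_increase):
    """Return (risk without exposure, risk with exposure, absolute difference)."""
    # A 900% increase means the exposed risk is 10 times the baseline.
    exposed_risk = baseline_risk * (1 + percent_increase / 100)
    return baseline_risk, exposed_risk, exposed_risk - baseline_risk


# Baseline risks are ASSUMED; they are not taken from the exercise or from data.
for baseline in (0.0001, 0.01):
    base, exposed, diff = absolute_risks(baseline, 900)
    print(f"baseline {base:.4%} -> exposed {exposed:.4%} (difference {diff:.4%})")
```

The same relative increase corresponds to very different absolute changes depending on the baseline, which is one piece of information the quoted sentence leaves out.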

51. “65% of the deaths in accidents involving SUVs are due to rollovers, whereas only 22% of the deaths in car accidents come from this cause.” (NBC Nightly News, 9/20/00.) From this we can infer that SUVs are much more prone to rollovers than are cars.
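A small numerical sketch may help with item 51. All rates below are invented so that the resulting shares match the quoted 65% and 22%; they are not data, and the function name `rollover_share` is hypothetical. The point of the sketch is that a share of deaths depends on deaths from other causes as well as on rollovers.

```python
def rollover_share(rollover_deaths, other_deaths):
    """Fraction of a vehicle type's accident deaths that come from rollovers.

    Both arguments are deaths per billion miles driven (hypothetical units).
    """
    return rollover_deaths / (rollover_deaths + other_deaths)


# INVENTED scenario: both vehicle types have the SAME rollover death rate
# per mile (6.5), but cars have many more deaths from other causes.
suv_share = rollover_share(6.5, 3.5)    # 6.5 / 10.0
car_share = rollover_share(6.5, 23.0)   # 6.5 / 29.5

print(f"SUV rollover share of deaths: {suv_share:.0%}")
print(f"Car rollover share of deaths: {car_share:.0%}")
```

Under these assumed numbers the shares reproduce the quoted percentages even though the rollover death rate per mile is identical for the two vehicle types.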

52.

The dozen states that have chosen not to enact the death penalty since the Supreme Court ruled in 1976 that it was constitutionally permissible have not had higher homicide rates than states with the death penalty, government statistics and a new survey by the New York Times show.

Indeed, 10 of the 12 states without capital punishment have homicide rates below the national average, Federal Bureau of Investigation data shows, while half the states with the death penalty have homicide rates above the national average. In a state-by-state analysis, The Times found that during the last twenty years, the homicide rate in states with the death penalty has been 48 percent to 101 percent higher than in states without the death penalty.

The study by The Times also found that homicide rates had risen and fallen along roughly symmetrical paths in the states with and without the death penalty, suggesting to many experts that the threat of the death penalty rarely deters criminals. (Raymond Bonner and Ford Fessenden, “States with no Death Penalty Share Lower Homicide Rates,” New York Times, Sept. 22, 2000)

Why does this not show that the death penalty fails to deter?

53. SUVs have a rollover rate (calculated as rollovers per 100,000 miles traveled) 3 times the rate of cars. From this we can infer that they must be less safe than cars. (There is one obvious fallacy here; once you have found it, look for 2 other deeper fallacies.)

54. In the wake of the Firestone/Ford tragedy, Congressional committees and newspapers will try to explain what happened and cast blame by examining the internal documents in the companies about this case. What is the problem with proceeding in this way?

55. Public opinion polls revealed that most people oppose the impeachment of President Clinton. The behavior of the members of Congress who strongly pushed for impeachment therefore shows the weakness if not inaccuracy of the claim that politicians seek to maximize their chances of re-election.

56. Scholar A:

“Realism should predict that the strongest state will prevail in a crisis, and, for the Cold War, the only real dispute is over whether we should expect the conventional or the nuclear balance to be most important.”

Scholar B:

“No, Realism predicts that as long as the situation approximates the game of Chicken, the state with the stronger reputation for resolve or with the greater stake in the issue should prevail.”

What is wrong with both these claims?

57. “According to rational choice theory, a state will fight if the expected utility for going to war is greater than the utility of the status quo.” Why is this statement incorrect?

58. Gore won the popular vote in the 2000 Presidential election. It follows that he would have been elected President had there been a previous change in the Constitution eliminating the Electoral College and replacing it with a popular vote.

59.

“Most students involved in school shootings discussed their plans beforehand and did things that could have telegraphed the attacks, two Secret Service agents said.” (Judith Cohler, Associated Press story, July 18, 2001)

From this we can infer that wise public policy would be to act on these warning signs, immediately calling in for questioning students who display them.

60. It is striking how often borders between states of very unequal power are quite peaceful (e.g. US-Canada, France-Belgium), while conflict is more common when the neighbors are of roughly equal power. From this I can infer that rough equality of power is more conducive to conflict than is a very unequal distribution.

61. “The purpose of this book is to measure the capabilities of democracies in the realm of foreign policy by looking at the politics and institutions of two of the oldest and most prominent of democratic states.” (Kenneth Waltz, Foreign Policy and Democratic States, p. 1.) In fact, this does not describe what the book does, which is to compare the foreign policy capabilities of Great Britain and the US. But if the sentence did give the book’s purpose, it would fall into 2 methodological traps.

62. The proper way to conduct a post-mortem on why the US was taken by surprise by the terrorist attacks of September 11 is to go back over the information that was or should have been available to the CIA and FBI and ask whether this was sufficient to have enabled a reasonable person or organization to have inferred that this attack was quite likely.

63a. “We were debating whether to go to war with a particular country and I thought I had won the argument when I was able to convince my boss that the chances of victory were clearly greater than 50 percent.”

63b. “Being wiser than I was in the previous case, I was sure I had won the argument when I showed my boss that, taking everything into account, the expected utility of starting the war was greater than the value of the status quo.” Why might this not be a winning argument?

64c. “OK, this time I’m sure I’ve got it right. In this case, I was able to show that the expected utility of fighting was less than the value of the situation as it is today. I was sure that this would mean that no serious person could argue for fighting. But I was wrong yet again.” Why?

65. Most international agreements are complied with. This disproves the common argument that difficulties in ensuring compliance explain why cooperation is difficult to develop and sustain in international politics.

66. To study the effects of whether a mother is employed outside the house on a child’s achievements and adjustment (assume that I can measure these), I need not only to look for the overall correlation, but to use control variables in order to establish causation. Most importantly, I want to see if any relationship I find remains after I hold constant the income of the mother and the family.

67. “Every known human carcinogen causes cancer in animals.” It follows that we should test all chemicals on animals for their carcinogenicity and refuse to release any that fail the test. (Mount Sinai Center for Children’s Health and the Environment, “She’s the test subject for thousands of toxic chemicals. Why?” New York Times, August 15, 2002.) (There are at least two problems here.)

68. The fact that the US was able to keep the USSR out of West Europe without a war shows the efficacy of the policy of deterrence.

69. Federal states like the US and the former Yugoslavia are more likely to have civil wars or dissolve than are unitary ones. The obvious lesson to those who are writing constitutions is to avoid a federal system.

70. High school dropouts on average earn $9,000 less than those who complete high school. It therefore should be a major objective of public policy to decrease the number of dropouts. (There are two fallacies here.)

71. There is a strong correlation between the extent to which a state is democratic and the extent to which it respects human rights. I infer that to protect the latter I should facilitate the former.

72. “Under Mayors Giuliani and Bloomberg crime in New York has significantly decreased. The obvious reason is the policing tactics they have adopted.” What are the 2 obvious sources of information that you could tap to judge the plausibility of this argument about causation?

73. I look at recent cases of attempted and successful revolution and find that most instances in which there was little if any violence were successful and that, by contrast, most cases in which there was significant bloodshed ended with the regime staying in power. From this I infer that rebels should use peaceful protest only.

74. To study whether some hospitals spend too much on desperately ill patients, I look at a sample of cases in which people died and see how much was spent on their care in the last two years of their lives. I find that the level of spending among excellent facilities varies by a factor of two. “We are comparing patients with identical outcomes—all were dead in two years–so it’s unlikely that differences in severity of illness account for the [spending] variations we saw.” (Robert Pear, “Researchers Find Huge Variations in End-of-Life Treatment,” New York Times, April 7, 2008.) From this why can I not infer that the spending level in the more expensive hospitals was excessive?

75.

“Almost half of those arrested for plotting or carrying out attacks against the U.S. had prior criminal records, mostly for small-time offenses, a study for New York State investigators found. Such interactions with local law enforcement represented possible opportunities to ‘detect and deter an attack,’ the study said” (Sean Gardiner, “Early Chances Often Missed In Terror Cases,” Wall Street Journal, January 3, 2011).

What are the problems here? (The rest of the article does not point them out, showing yet again the embarrassment of journalism.)

76. Your doctor tells you “Take this medicine and it will cut in half the chance that you will get a particular kind of cancer even though it has somewhat unpleasant, although not dangerous, side-effects.” His statistic is correct, but it is not the one you want. What is?

77. During the summer of 2012, many analysts and American officials said things like: “In response to continued Iranian provocations, we’re instituting new sanctions. As they take hold and the pain inflicted on Iran increases, Western bargaining leverage will increase.” Assuming that the sanctions indeed are causing pain, that the population blames the government, and that the government cares, why does the conclusion not follow? (Note that that conclusion is not that Iran will give in, but just that Western leverage will increase as the sanctions take hold.)

78. To help people live through avalanches, I interview survivors about the techniques they used (e.g., staying calm, moving slowly, being guided by any light they see). I then print (and sell) a pamphlet detailing these methods to increase the chance that anyone caught can survive. Perhaps I shouldn’t. (There is both a fallacy and a problem here.)

Good stuff. Lots more interesting than the usual medical examples.

42 thoughts on “Statistical fallacies as they arise in political science (from Bob Jervis)”

Sameera Daniels on March 3, 2021 10:47 AM at 10:47 am said:

I had an inkling you knew Bob Jervis. I’ve read most of his books. In fact Perception and Misperception in International Politics drew me to taking some of the foreign policy positions that I did.

I’ll review these insights. But I note that there are so many other narratives that are excluded b/c they may not serve the purposes of goals and objectives.

RE: #77- In reading Seyed Hossein Mousavian’s book on the Iran nuclear negotiations and several subsequent articles, I did not get the impression that Western leverage increased. According to Mousavian, Iranians blamed the US for their economic condition. I’m sure that some subset of the Iranian public may blame its leadership for some of Iran’s economic conditions. There are educated Iranians that would like to have secular leadership. But a good percent of them emigrated to the West since the 70s.

I grew up among some prominent Iranologists: Richard Frye and Arthur Upham Pope to name two. In fact Princeton was a hub to discuss Iran, with Allen Dulles presiding at a conference on Iran. I was about 8 or 9 then.

Reply ↓
- Sameera Daniels on March 3, 2021 10:52 AM at 10:52 am said:
  
  In hindsight, I should have done my PhD under Robert Jervis. But I’m too eclectic of a thinker to undertake a PhD. Ironically, what I think gave me an edge was following the evidence-based medicine movement in the 90s, a point that one of the defense officials recognized.
  
  Reply ↓
- Joshua on March 3, 2021 12:06 PM at 12:06 pm said:
  
  Sameera –
  
  But it’s an interesting exercise to evaluate where’s the fallacy in thinking that American bargaining leverage increases if the government is blamed (and the government cares that the public thinks the government is to blame)?
  
  While I doubt the government will get blamed (and I question how much it cares) I’m having trouble seeing the fallacy as the thought experiment is constructed.
  
  Maybe it’s that it’s wrong to assume the public would want to knuckle under even if they think the government’s to blame for their misery?
  
  Reply ↓
  - somebody on March 3, 2021 12:13 PM at 12:13 pm said:
    
    My guess is that the bargaining power is the size of the threat you can issue. If you’ve already discharged the threat and the government is still standing, your bargaining power has actually gone down. The carrot on the other side of possible sanction alleviation isn’t a symmetric reward, since the opposing state doesn’t care about people’s prosperity, just that there isn’t a popular revolution and they stay in power.
    
    I could be completely wrong though, don’t really know what I’m talking about here
    
    Reply ↓
    - somebody on March 3, 2021 12:15 PM at 12:15 pm said:
      
      Analogy: I take a hostage. I say, “give me your wallet or this guy gets it.” You don’t give me your wallet immediately, so I break his neck after just a second. Well, now I have nothing.
    - Joshua on March 3, 2021 12:35 PM at 12:35 pm said:
      
      somebody –
      
      Makes sense, and I think that’s the kind of answer he’s going for as it fits better with a “fallacy” framing than the answer I offered. It’s more of a general principle than the one I suggested, which is more context-specific.
      
      What’s the prize if you got the right answer? Is Andrew giving away candy or t-shirts?
  - Sameera Daniels on March 3, 2021 12:45 PM at 12:45 pm said:
    
    I believe Andrew has compiled a list of statistical fallacies. If Andrew can re-post that would be great. Otherwise, I will have to re-read Jervis’ Perception and Misperception in International Politics, which amply covers the contexts in which specific cognitive biases [not statistical fallacies] guide international rhetorical posturing.
    
    In short, I think we have too little information given for each scenario, imo. But I would label #77 as potentially an example of ‘false cause’.
    
    My experience of foreign policy development suggests that we rely on too few perspectives. And certainly, that can be gleaned by reading Sam Huntington’s Clash of Civilizations quite well. I was able to construct who had influence on Huntington b/c, surprisingly, I had heard the same and similar themes contained in his book, as a teen and in my twenties. I often heard behind the scenes ME and Iran expertise at the Harvard Club lunches b/c I tagged along with my Dad. So you hear what makes experts think the way that they do. That is to say casual conversations reveal more clearly how experts develop their expert niches.
    
    Reply ↓
  - jim on March 3, 2021 3:42 PM at 3:42 pm said:
    
    Perhaps the fallacy is that no one has the foggiest clue how to measure “bargaining power” in a negotiation such as the situation in Iran, so any claim to be increasing (or losing) it is vacuous.
    
    It would be interesting to try coming up with some kind of quantitative measure of it though. How do you do it? IMO you have to break down the government into individuals, assign each one a “bargaining” factor based on their relative hawkish or dovishness, then assign them some relative power factor in the hierarchy. Then each move on the strategic chess board becomes a multiplier based on the pressure that it creates on each individual.
    
    Then run your model and publish a paper claiming to have “shown” quantitatively how each individual impacts negotiations and how a wide range of potential moves would impact “bargaining power”! Voila! Science is so easy.
    
    Reply ↓
  - Dzhaughn on March 4, 2021 12:42 AM at 12:42 am said:
    
    Sanctions will simultaneously inflict pain on the countries that are applying the sanctions. A question of who gains leverage has to account for this.
    
    Reply ↓
rm bloom on March 3, 2021 3:49 PM at 3:49 pm said:

7. In explaining the origins of World War II, A.J.P. Taylor is correct in making almost no reference to Hitler’s extermination of the Jews. He is not concerned with making moral judgments about Hitler nor is he arguing whether or not the Allies should have made war on Hitler for the sake of those in Germany and the occupied territories. All he is trying to do is explain how and why the war started, the degree to which Hitler was unalterably aggressive, and the extent to which other countries share responsibility for the war. For questions such as these Hitler’s racial policies are irrelevant. Without endorsing Taylor’s answers to the questions he has set himself, or claiming that those questions are the most important ones, we can see that he was right to avoid being drawn into a discussion of Hitler’s domestic policies.

Fallacy. The war was prosecuted for a variety of domestic reasons. It is impossible to understand properly the momentum and direction of the Hitler phenomenon, and its drive toward war, without understanding its extraordinary locus-situs: the absolute obsession with the so-called “Jewish Problem”. The second war was many things; but without the basic premiss — that the first war was not lost, but that the Germans were *betrayed* …. by “international jewry” … the event and the urgency of the second war is incomprehensible. It takes a great deal of special pleading — and Taylor made a career out of doing so — to exclude this gross factor (that of revenge and revanche) from consideration.

Reply ↓
- Dzhaughn on March 4, 2021 2:09 AM at 2:09 am said:
  
  I do not believe this is the fallacy he intends to illustrate, although I’m not exactly sure what is. One reason that I think that is that one would have to read Taylor to evaluate your claims, which is out of bounds for this game. I would be beating up a strawman.
  
  Reply ↓
Andrew Halim on March 3, 2021 4:50 PM at 4:50 pm said:

Can someone help me and explain to me what fallacy(ies) are involved in this statement (I have 0 background knowledge in politics/international relations):

39. To determine the causes of wars, I look at a random sample of wars, examining in detail the domestic, bureaucratic, and international factors that seem to be involved and from these results build a general theory about the relative importance of these influences.

Reply ↓
- Andrew on March 3, 2021 5:08 PM at 5:08 pm said:
  
  Andrew H.:
  
  The problem is selection bias. To study the causes of war, you need to include cases where war could arise but did not, along with cases where war actually arose.
  
  Reply ↓
Mathijs Janssen on March 3, 2021 5:35 PM at 5:35 pm said:

I feel like I need an answer key as well to appreciate this fully. Is there typically supposed to be one well known fallacy per statement? I get stuck on quite a few, e.g.
1. Seems wrong for many reasons (you’d need to compare to other options and you would need to measure risk, not just probability of success). 1a. seems really different from 1, but the labeling suggests a connection.
2. (cause vs correlation) and
3 (sampling bias) seem obvious, but in
4. I cannot tell what the mistake is. It seems like a correct conclusion to me.
5. Is also not clear to me. The evidence does not seem strong enough for the conclusion, but it does point in that direction. The comparison is not ceteris paribus and no correlation is not proof of no connection, but it seems like bad news for both theories.

I kind of gave up after this. I feel like I’m not well versed enough in the fallacies to get much from this without being able to check the answers.

Reply ↓
- Andrew on March 3, 2021 5:48 PM at 5:48 pm said:
  
  Mathijs:
  
  I think they’re meant to be discussion questions. The challenge, as always, is to teach students how to be critical without being nihilistic.
  
  Reply ↓
  - Mathijs Janssen on March 3, 2021 6:16 PM at 6:16 pm said:
    
    I see, fair enough. The framing by Jervis in the introduction does seem to suggest that he has fairly specific answers in mind.
    
    Reply ↓
- dl on March 3, 2021 5:55 PM at 5:55 pm said:
  
  #4: Democrats could be 1/2 very conservative and 1/2 very liberal. (Was easier to get this before partisan sorting.)
  #5. Ecological fallacy + possible heterogeneous effects.
  
  (I’m not Jervis or anything, just my guess.)
  
  Reply ↓
  - Mathijs Janssen on March 3, 2021 6:15 PM at 6:15 pm said:
    
    Good point about 4! I am convinced. I don’t see it with 5, though. I don’t see how the ecological fallacy applies (that’s about groups versus individual features, right? What are the groups here?). The heterogeneous effects are a reason that the result does not follow, but it would be a bit of a coincidence if, say, all countries with higher instability have higher aggression ceteris paribus, but also have unrelated characteristics that make them less prone to aggression.
    
    Reply ↓
    - dl on March 3, 2021 6:22 PM at 6:22 pm said:
      
      the data appears to be at the “world-year” level, whereas you need data at the “country-year” level to get at the question, no?
    - Mathijs Janssen on March 3, 2021 6:26 PM at 6:26 pm said:
      
      Ah, I think you’re right; I misread the statement on that front. I read “in all countries” as “in each country”. I’m not sure if my reading is incorrect in terms of the English (it’s not my mother tongue), but now that I see your interpretation, I am sure that that is the intended one.
    - Michael on March 3, 2021 6:29 PM at 6:29 pm said:
      
      For 5 I think it’s possibly Simpson’s paradox (https://en.wikipedia.org/wiki/Simpson%27s_paradox) where there can be a positive (or negative) association in subgroups that disappears when you compute the association for the whole group, due to different intercepts. And I think Simpson’s is a specific form of an ecological fallacy.
      
      I think the other major fallacy is correlation != causation?
- Ken Schulz on March 4, 2021 10:53 PM at 10:53 pm said:
  
  #4 Democratic leaders are slightly more liberal than Democratic voters. If (and only if) Democratic voters are moderate, can one conclude that Democratic leaders are moderate. If Democratic voters are quite liberal, Democratic leaders are even more liberal.
  
  Reply ↓
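
A quick sketch of the point dl and Ken Schulz make about #4, with made-up numbers. The scale (-10 for very liberal, +10 for very conservative), the scores, and the cutoff for "moderate" are all assumptions for illustration and are not taken from the original item. The idea is only that a group average close to the voters' average says nothing about where the individual leaders sit.

```python
from statistics import mean

# Hypothetical scores on a -10 (very liberal) to +10 (very conservative) scale.
dem_voters = [-4, -3, -2, -2, -1, 0]
dem_leaders = [-9, -9, -8, 3, 4, 4]  # two blocs, nobody near the center


def share_moderate(scores, cutoff=2):
    """Fraction of individuals whose score lies within `cutoff` of zero."""
    return sum(abs(s) <= cutoff for s in scores) / len(scores)


voter_mean = mean(dem_voters)
leader_mean = mean(dem_leaders)

print(f"voter mean:  {voter_mean}")
print(f"leader mean: {leader_mean}")
print(f"share of leaders who are individually moderate: {share_moderate(dem_leaders):.0%}")
```

In this toy data the leaders' average is only slightly to the left of the voters' average, yet not a single leader falls inside the moderate band. Ken Schulz's version of the objection is the other half: even if leaders were tightly clustered, "close to the voters" only means "moderate" if the voters themselves are moderate.
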
dl on March 3, 2021 6:02 PM at 6:02 pm said:

26 is an unpleasant one though. 40, the supposedly related one, is easy, but 26 seems like some kind of annoying Bayesian trickery lol. I mean the conclusion doesn’t *necessarily* follow…we can concoct a world where only people with good cars leave them unlocked, and people with shit cars lock them…but I don’t think that’s his point here.

Reply ↓
- Michael on March 3, 2021 6:20 PM at 6:20 pm said:
  
  They’re both base rate fallacies. The point in 26 is not (I think) good cars or shit cars. It’s just that if there was no causal relationship between locking your car and getting it stolen and the base rate of the exposure (not locking cars) is high, then you can still observe that 75 percent number in the presence of no causality.
  
  Reply ↓
  - dl on March 3, 2021 6:44 PM at 6:44 pm said:
    
    yeah good point. the world in which > 75% of people don’t lock their cars is also fanciful, but logically valid.
    
    Reply ↓
  - Sameera Daniels on March 3, 2021 9:49 PM at 9:49 pm said:
    
    Yes, base rate fallacies. Base rate neglect is prevalent in much of our thinking.
    
    Reply ↓
- Mathijs Janssen on March 3, 2021 6:23 PM at 6:23 pm said:
  
  This is getting embarrassing, but I am even confused by 40. If the statement is literally that the probability of an accident is smaller in a 10 mile ride than in a 5 mile ride, then the conclusion seems correct. If the statement is supposed to be that the probability of an accident per mile is higher for shorter rides, then the conclusion is incorrect, but the assumption is poorly phrased. Or am I missing something again?
  
  Reply ↓
  - dl on March 3, 2021 6:38 PM at 6:38 pm said:
    
    the statement doesn’t imply anything about the per-mile accident rate of short vs long trips. for example if more short trips are taken than long trips, the per-mile rate could be equal between short and long trips. or the rate could be greater for long trips.
    
    (that’s leaving aside the ugly counter-factual of “how do accident rates change for someone who starts going on long trips instead of short ones, even when he doesn’t need to go anywhere far, based on a mistaken belief about accident rates?” lol.)
    
    Reply ↓
    - dl on March 3, 2021 6:41 PM at 6:41 pm said:
      
      plus i guess if you’re just driving around to avoid going on < 5 mi trips, it's not even the per-mile rate that's relevant.
    - Mathijs Janssen on March 3, 2021 6:42 PM at 6:42 pm said:
      
      That _was_ embarrassing! Yes, you’re right!
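
Michael's base-rate reading of 26 can be checked with a few lines of arithmetic. This is a sketch under assumed numbers: the share of unlocked cars and the theft rates below are hypothetical, and the function name is made up for the example. It computes what fraction of stolen cars were unlocked, given how common unlocked cars are and how often each kind gets stolen.

```python
def share_unlocked_among_stolen(p_unlocked, theft_rate_unlocked, theft_rate_locked):
    """P(unlocked | stolen), by Bayes' rule, from a base rate and two theft rates."""
    stolen_unlocked = p_unlocked * theft_rate_unlocked
    stolen_locked = (1 - p_unlocked) * theft_rate_locked
    return stolen_unlocked / (stolen_unlocked + stolen_locked)


# Case A (hypothetical): locking makes no difference, but 75% of cars are unlocked.
no_effect = share_unlocked_among_stolen(0.75, 0.01, 0.01)

# Case B (hypothetical): only 20% of cars are unlocked, but they are stolen 12x as often.
strong_effect = share_unlocked_among_stolen(0.20, 0.03, 0.0025)

print(f"no causal effect:     {no_effect:.0%} of stolen cars were unlocked")
print(f"strong causal effect: {strong_effect:.0%} of stolen cars were unlocked")
```

Both scenarios produce the same 75 percent figure, so the figure by itself cannot tell them apart. What is missing is the base rate, that is, how many cars are left unlocked in the first place.
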
Jukka on March 3, 2021 9:06 PM at 9:06 pm said:

A brilliant collection! I like the idea also in terms of pedagogy.

Actually, something like this might be a good idea for a textbook. Not just questions, but a whole Q&A. For statistics education, one might present a bad analysis and then a better one? But as the examples demonstrate, it wouldn’t have to be about statistics. One could operate with a similar framework for qualitative research, history, or whatever.

Reply ↓
rm bloom on March 4, 2021 2:24 AM at 2:24 am said:

For an encyclopaedic enumeration of this sort of provocative material, see “Historians’ Fallacies” by D.H. Fischer.
Fischer’s critiques, in my opinion, seem often to fall prey to a certain “positivistic” fallacy, but the examples he picks out are fascinating on their merits (or shortcomings) alone.

Reply ↓
John on March 4, 2021 5:21 AM at 5:21 am said:

To help people live through avalanches, I interview survivors about the techniques they used (e.g., staying calm, moving slowly, being guided by any light they see). I then print (and sell) a pamphlet detailing these methods to increase the chance that anyone caught can survive. Perhaps I shouldn’t. (There is both a fallacy and a problem here.)

What is the problem here, besides that we don’t know what the non-survivors did?

Reply ↓
- Joshua on March 4, 2021 8:03 AM at 8:03 am said:
  
  Maybe there’s a problem in that only people who got caught in an avalanche to begin with are being interviewed (kind of doubling down on the survivor bias aspect)?
  
  Reply ↓
- Michel Ney on March 4, 2021 8:40 AM at 8:40 am said:
  
  I guess that this is exactly the problem. If non-survivors were staying calm, were moving slowly, were guided by any light they saw, at the same proportions as survivors, there would be no association between such behaviors and survival. Plus, if you teach people to engage in such behaviours, and you do not have evidence about their helpfulness, it is possible that you prevent people from doing more helpful things.
  
  Reply ↓
- Jonathan (another one) on March 4, 2021 10:06 AM at 10:06 am said:
  
  The misdirection in this fallacy is that the things listed seem reasonable. What if every survivor thought at one point “I’m gonna die.” Surely you wouldn’t think that thinking “I’m gonna die” is an essential component of survival.
  
  Reply ↓
- Dzhaughn on March 4, 2021 6:06 PM at 6:06 pm said:
  
  It may be that the advice kills more people than it helps. Alas, we only interview the ones it worked for.
  
  Similar to a classic problem of where to reinforce the armor on planes.
  
  https://hbr.org/2009/03/beware-the-danger-of-selection.html
  
  Reply ↓
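
The avalanche thread above can be made concrete with a small table of counts. Everything here is hypothetical: the 1,000 people, the split between those who used the technique and those who did not, and the survival counts are invented to illustrate Dzhaughn's scenario in which the advice actually hurts. The sketch compares what the survivor interviews show with what the full table shows.

```python
# Hypothetical counts for 1,000 people caught in avalanches (all numbers are made up).
caught = {
    "used_technique": {"total": 800, "survived": 160},
    "did_not_use": {"total": 200, "survived": 80},
}

# What the pamphlet author sees: only the survivors.
survivors = sum(group["survived"] for group in caught.values())
share_of_survivors_using = caught["used_technique"]["survived"] / survivors

# What the author would need: survival rates among users and non-users alike.
survival_rate = {name: g["survived"] / g["total"] for name, g in caught.items()}

print(f"survivors who used the technique: {share_of_survivors_using:.0%}")
for name, rate in survival_rate.items():
    print(f"survival rate, {name}: {rate:.0%}")
```

In this invented table two thirds of the survivors report using the technique, simply because most people used it, while the survival rate among users is half that of non-users. Interviewing survivors alone cannot distinguish this case from one where the technique helps.
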
dl on March 4, 2021 5:22 PM at 5:22 pm said:

what’s the answer to 47?

Reply ↓
- Kevin on March 4, 2021 5:26 PM at 5:26 pm said:
  
  Maybe they want to select for new customers who already exercise regularly, and therefore are healthier than average.
  
  Reply ↓
- Mathijs Janssen on March 4, 2021 5:37 PM at 5:37 pm said:
  
  Maybe that it signals health? As in, only healthier people would take the offer. Which is useful information for an HMO?
  
  Reply ↓
- Dzhaughn on March 4, 2021 5:57 PM at 5:57 pm said:
  
  They are selecting for a healthier pool of insured. For example, those who are quite ill do not value the gym membership bonus. Those who are avid exercisers will value it.
  
  Reply ↓
- dl on March 4, 2021 10:05 PM at 10:05 pm said:
  
  good thoughts everyone thanks
  
  Reply ↓

Statistical Modeling, Causal Inference, and Social Science
