Sorry for the delayed response; I’ve been away.

yes, I am arguing the reverse, more or less, assuming prediction markets operate as promised.

My previous arguments contained:

1) Pr(good qualities for general | nominated) > Pr(good qualities for general), which is basically Simpson’s.

2) Pr(good qualities for general, not good for nomination) can be incorporated into Pr(win|nominated) when *comparing* candidates, and it is more or less in practice. The point is a relative one. We should expect this in a prediction market that functions as promised, which is the domain of this discussion.

In sum, my verbal arguments accounted for the causal graph feature that highlights variables determining nominee & president. I made this very distinction when discussing a candidate being charismatic (good for nominee & election) and being moderate (good for election). The qualities that are good for both get updated for all candidates (perhaps unequally, depending on the base—e.g. Yang more). Again, it is a relative point, and .2 vs .8 will mean something, as I mentioned before. Calibration is another matter, and it will depend greatly on assumptions.

Thanks. Given all of the misunderstandings in the earlier thread, I figured a program was less subject to misinterpretation.

Regarding public v. non-public information, I do think it’s important because it complicates the simplest version of the correlation != causation story, namely one in which the only confounder is public information. In that case, the response is that the market is conditioning on it, so it should handle this confounder just fine. The real problem is unobserved confounding, which means nonpublic information. This could be either time-invariant factors that have not yet been disclosed, or future values of time-varying factors (which are thus presently unknown).

I did not draw arrows between the public and nonpublic variables because I’m not interested in modeling interrelations among them. They exist in the graph as a partitioning exercise. E.g., the nonpublic information that affects both nominations and elections does not include consequences of public information that affects both nominations and elections. Rather, it only includes further exogenous information not reflected in the public information variable. So while there are unobserved variables on the path from z* to x and z* to y, conceptually these are not included in z. This allows me to draw the graph the way I did.

Agree completely on the literature. I’m sort of surprised we’re having this discussion at such a primitive level—it ought to be easy to cite papers on this, but I’m not aware of any. Would appreciate pointers to relevant references if anyone finds any.

Ah, I get it now. Thanks!! Great to see this programmed out in R.

I hadn’t considered separating out public and private variables. That’s an interesting idea and, as compared with my examples, it engages more with the prediction market source of the p(win | nominated) values. One point it made me wonder about is whether there should be arrows between the public and private variables.

It’s been a pleasure to have these discussions about electability, especially since I have difficulty finding good papers on it in the social choice and political science literatures. On the one hand, many people’s main political objective is to beat the other party’s presidential candidate in the general election. Yet, on the other hand, there seems to be relatively little theoretical or evidence-based research on how you should cast your primary vote, strategically speaking, if your main goal is to beat the opposing party’s candidate. Or maybe I’ve just missed the articles in my searches.

To turn my observational model DAG into your Figure 1, just collapse z*, z, u*, u, v* and v into one variable. This is what you call “background”. It’s the same model, I’m just drawing some further distinctions between components of “background” which aren’t super important from a conceptual POV, but which are useful for motivating the simulation script.

Ha, that’s funny. The difference between my notation and Pearl’s is mostly due to my erroneous recollection. What I mean by the observational model graph is this:

(1) Nomination (“x”) is a parent of election (“y”);

(2) Some stuff (“z”) is a parent of both nomination and election;

(3) Some stuff (“u”) is a parent only of nomination;

(4) Some stuff (“v”) is a parent only of election;

(5) Some stuff (“*”) is public information, and so is fully conditioned on by efficient markets.

Items (1)-(5) are equivalently reflected in the structural equations for x and y. To get the interventional model graph, we delete the arrows going into x, and replace the random variable x with the constant x = 1. In the equations, this means replacing the structural equation for x with the constant 1, but otherwise leaving the rest of the equations unmodified. Does that help?
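As a numerical sketch of that surgery (the linear forms and coefficients below are placeholders of my own for the unspecified g and f, with the starred variables dropped for brevity):

```r
# Hypothetical linear stand-ins for the unspecified g and f
set.seed(1)
n <- 100000
z <- rnorm(n); u <- rnorm(n); v <- rnorm(n)

# Observational model: x := g(z, u), y := f(x, z, v)
x <- 1 * (z + u + rnorm(n) > 0)
y <- 1 * (x + z + v + rnorm(n) > 0)

# Interventional model: delete the arrows into x, replace its
# structural equation with the constant 1, leave f unmodified
x.do <- rep(1, n)
y.do <- 1 * (x.do + z + v + rnorm(n) > 0)

mean(y[x == 1])  # p(y = 1 | x = 1): contaminated by the shared z
mean(y.do)       # p(y = 1 | do(x = 1))
```

With z a parent of both x and y, the two printed quantities diverge; they coincide only if z is removed from one of the two equations.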

I think the only substantive difference between your conception and mine is that I’m distinguishing between public and private information, since efficient markets fully condition on the former but not the latter. Otherwise I think we’re telling the same story conceptually, which is why we arrive at the same conceptual conclusion. But I’d be grateful to be retaught how to write these DAGs and equations to conform to more standard notation!

Fascinating. I have no idea how to read these graphs! Even though we have been arriving at similar conclusions, it seems we have been following very different approaches.

Is there a guide to understanding this graph notation that you can recommend?

The causal graph examples I gave are in the standard DAG notation found, for example, in Pearl’s “Causality” and in Pearl, Glymour, and Jewell’s “Causal Inference in Statistics: A Primer.”

Oops, I didn’t mean for my comment to be so nested. I’ll try again…

Thanks Joshua! In reply:

> Perhaps it is on me, but I fail to see what the causal graphs could add here.

The causal graphs show that, even in the most basic, stripped-down version of the problem, having p(win | nominated) be a good approximation of electability requires that the variables that determine the nominee are largely independent of the variables that determine the president. The requirement is implausible, so we shouldn’t expect p(win | nominated) to be a good approximation of electability.

To my understanding, this conclusion addresses the heart of the discussions going on above.

The causal graphs also show that observing “p(win = A | nominated = A) > p(win = B | nominated = B)” and concluding “candidate A has greater electability” is the same mistake as observing “p(healed | got drug A) > p(healed | got drug B)” and concluding “drug A has greater heal-ability.” Same graphs.
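To make the shared structure concrete, here is the classic numerical form of that reversal in R (textbook-style kidney-stone counts, purely illustrative; these are not numbers from this thread):

```r
# Drug A beats drug B within each severity stratum, yet loses overall,
# because A was given mostly to the severe cases
healed  <- c(A.mild = 81, A.severe = 192, B.mild = 234, B.severe = 55)
treated <- c(A.mild = 87, A.severe = 263, B.mild = 270, B.severe = 80)

round(healed / treated, 2)            # per-stratum heal rates: A wins both

sum(healed[1:2]) / sum(treated[1:2])  # A overall: ~0.78
sum(healed[3:4]) / sum(treated[3:4])  # B overall: ~0.83, the reversal
```

The confounder (severity) plays the same graphical role as the variables that determine both nomination and election.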

> I mean we have already noted the Simpson’s-like structure

I couldn’t find previous mentions of Simpson’s paradox, though maybe it was implied.

> In any event, we can’t test whose model is more right.

I don’t intend the causal graphs as models, really — they’re identifiability arguments. I’m using them to claim that p(win | nominated) identifies a fundamentally different quantity than electability. I think this point was made by Ram earlier too. To me, it means that our default position should be that p(win | nominated) is a poor measure of electability, and that substantial evidence should be needed to shake us from this default position. You seem to have been arguing the reverse.

I agree that the head-to-head polls have problems like the ones you mention. Thanks for raising that point. However, I’m saying that the head-to-head polls are targeting the right quantity (electability), while estimates of p(win | nominated) are targeting something else.

Maybe thinking about treatments and patient outcomes could clarify:

Suppose an unadjusted observational study found p(healed | got drug A) > p(healed | got drug B), while a fully-adjusted observational study found p(healed | do(got drug B)) > p(healed | do(got drug A)). In practice, it could be that the unadjusted study was a lot better quality than the adjusted study, but we shouldn’t let this mislead us into thinking that in general “p(healed | got the drug)” is a better measure of ability to heal than “p(healed | do(got the drug)).”

Analogously, even if it were the case that the prediction markets are of much better quality than the head-to-head polls, we still shouldn’t be misled into thinking that p(win | nominated) is the better measure of electability.


https://www.dropbox.com/s/pudrjf94thz36fm/Electability.pdf?dl=0

This defines notation, lays out two models as both DAGs and structural equations, provides (my understanding of) Andrew’s and my definitions of electability in terms of these models, and includes a (typo-free!) R script for illustrating differences between our definitions, with a default specification designed by analogy with Carlos’s war with Canada example.

Would appreciate feedback from anyone (still!) interested in this discussion. The script ought to fully disambiguate my prior comments, but let me know if I can clarify anything (or if the link doesn’t work).

Dear More Anonymous-

Thanks for taking the time to reply here. Perhaps it is on me, but I fail to see what the causal graphs could add here. I mean we have already noted the Simpson’s-like structure, with nomination signaling info that increases general election odds. The point I have made in other comments is that all candidates share this. The unique characteristics that aren’t honed or brought out in the primaries, the ones that set the candidate apart, are what P(win|nominated) can capture, depending on how we model the election. In any event, we can’t test whose model is more right. Also: don’t worry, I wouldn’t use P(win|nominated) if I needed to make decisions, albeit for other reasons. :)

Another typo… The intended coefficients for the war with Canada scenario are:

a0 = -2

a1 = 2

a2 = 1

a3 = 1

a4 = 1

b0 = -2

b1 = 2

b2 = 1

b3 = 1

b4 = 1

This implies that the average candidate in this population is very unlikely to be nominated, and even if nominated very unlikely to be elected (hence low electability). However, if there is a war with Canada (also very unlikely), the candidate is both more likely to be nominated, and more likely to be elected. In this scenario, (my understanding of) Andrew’s proposal judges the average candidate to be somewhat electable (~60%), while my proposal judges them to be rather unelectable (~25%).
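As a sanity check (my own closed-form calculation, not part of the script): the linear index for nomination is a0 plus a sum of independent standard normals with weights (a1, a2, a3, a4) = (2, 1, 1, 1), i.e. normal with mean -2 and sd sqrt(7), so the population-level probability of nomination can be computed directly:

```r
# Population marginal P(nominated) under a0 = -2 and weights (2, 1, 1, 1):
# the index is N(-2, sqrt(2^2 + 1^2 + 1^2 + 1^2)) = N(-2, sqrt(7))
1 - pnorm(2 / sqrt(7))  # roughly 0.22
```

(For the “average” candidate with all exogenous variables at zero, the index is exactly -2, so that candidate is never nominated in this threshold model; the 0.22 is the marginal over the whole population.)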

Typo…

# Estimate Ram’s proposal (using nonparametric regression)

n1 = xgb.DMatrix(data = design[, c(1, 3)], label = y0)

should instead say

# Estimate Ram’s proposal (using nonparametric regression)

n1 = xgb.DMatrix(data = design[, c(1, 3)], label = y1)

Attempt II, part II of II:

### Analysis ###

# Construct design matrix for electability calculations

design = cbind(z.star, u.star, v.star)

# Estimate Andrew’s proposal (using nonparametric regression)

index = which(x0 == 1)

n0 = xgb.DMatrix(data = design[index, ], label = y0[index])

gbm.cv = xgb.cv(data = n0, nfold = 10, nrounds = 500, objective = "binary:logistic", verbose = FALSE)

lambda = which.min(gbm.cv[["evaluation_log"]][, "test_error_mean"][["test_error_mean"]])

m0 = xgboost(data = n0, nrounds = lambda, objective = "binary:logistic", verbose = FALSE)

n0 = xgb.DMatrix(data = design, label = y0)

p0 = predict(m0, n0)

# Estimate Ram’s proposal (using nonparametric regression)

n1 = xgb.DMatrix(data = design[, c(1, 3)], label = y0)

gbm.cv = xgb.cv(data = n1, nfold = 10, nrounds = 500, objective = "binary:logistic", verbose = FALSE)

lambda = which.min(gbm.cv[["evaluation_log"]][, "test_error_mean"][["test_error_mean"]])

m1 = xgboost(data = n1, nrounds = lambda, objective = "binary:logistic", verbose = FALSE)

p1 = predict(m1, n1)

### Results ###

# Summarize Andrew’s electability scores

summary(p0)

sd(p0)

# Summarize Ram’s electability scores

summary(p1)

sd(p1)

# Summarize paired differences

summary(p0 - p1)

sd(p0 - p1)

# Proportion of simulations where Andrew’s score exceeds Ram’s

mean(p0 > p1)

Attempt II, part I of II:

# install.packages("xgboost")

library(xgboost)

### Specification ###

# Number of simulations

N = 100000

# Means of exogenous random variables

mu.z = 0

mu.z.star = 0

mu.u = 0

mu.u.star = 0

mu.v = 0

mu.v.star = 0

# SDs of exogenous random variables

sigma.z = 1

sigma.z.star = 1

sigma.u = 1

sigma.u.star = 1

sigma.v = 1

sigma.v.star = 1

# Coefficients for determining nomination status

a0 = -2

a1 = 2

a2 = 1

a3 = 1

a4 = 1

# Coefficients for determining election status

b0 = 2

b1 = 2

b2 = 1

b3 = 1

b4 = 1

### Simulation ###

# Generate exogenous random variables

z = rnorm(N, mu.z, sigma.z)

z.star = rnorm(N, mu.z.star, sigma.z.star)

u = rnorm(N, mu.u, sigma.u)

u.star = rnorm(N, mu.u.star, sigma.u.star)

v = rnorm(N, mu.v, sigma.v)

v.star = rnorm(N, mu.v.star, sigma.v.star)

# Generate observational data

x0 = 1 * (a0 + (a1 * z) + (a2 * z.star) + (a3 * u) + (a4 * u.star) > 0)

y0 = 1 * (b0 + (b1 * z) + (b2 * z.star) + (b3 * v) + (b4 * v.star) > 0) * x0

# Generate interventional data

x1 = rep(1, N)

y1 = 1 * (b0 + (b1 * z) + (b2 * z.star) + (b3 * v) + (b4 * v.star) > 0) * x1

This has turned into a discussion of electability, which is getting a bit away from the topic, but I’ll bite. What is the evidence that head-to-head polls like these are predictive? They surely won’t be if candidates differ greatly in name recognition.

Regarding whether P(win | nominate) captures something meaningful, or is a good approximation, under our assumptions that betting markets work as promised and elections operate as they do: none of our claims are testable, since the assumptions don’t hold.

Carlos, thanks for the summary of people’s positions to date. Very helpful. I’ve tried to take your summary into account in my responses to Andrew and Joshua above.

In short, I think P(win | nominate) is rarely a good approximation for electability, and that we should instead look to poll results that force potential voters into a binary choice between Trump and Biden, or between Trump and Warren, or between Trump and Sanders, etc. These polls target P(win | do(nominate)), and not P(win | nominate).

Joshua —

tl;dr: Thanks very much for your response. I think that p(win | nominated) will commonly be a poor measure of electability. In particular, I express a simple version of the problem in causal graphs, which show that unrealistic independencies are required for p(win | nominated) to be a good measure of electability.

I agree with your point that we may want to care about p(win | nominated), which I take to be a fairly uncontroversial claim. However, I also think that p(win | nominated) will commonly be misleading when used as a measure of electability.

Below, I show this by expressing a simple version of the problem in causal graphs. As seen in the graphs, p(win | nominated) will only be a good measure of electability to the extent that the characteristics that determine who wins the nomination are independent of the characteristics that determine who wins the presidential election. Since these characteristics are generally highly related and overlapping, I expect that p(win | nominated) generally provides a poor reflection of electability.

For candidates A and B, one can also ask, “How often will p(win = A | nominated = A) > p(win = B | nominated = B), but B be more electable than A?” This occurs under the same conditions as the classic Simpson’s paradox, as can be seen in both the conditional probability tree example I gave before and the causal graphs below. In practice, I *think* that the conditions for Simpson’s reversal will be relatively common in the US primary system, since potential nominees who are more favored in the primary are often less favored for the presidential election (e.g., candidates far left or right of centre). However, I am far from sure of that.

Here are the causal graphs:

https://i.postimg.cc/YSLQHFcx/Electability-DAGs.jpg

In Figure 1, the nomination result is determined by a set of background variables, while the presidential election result is then determined by both the background variables and the nomination result.* We would like to identify the candidate who, if supported to become the nominee, is most likely to defeat Trump. This candidate is the most electable.

For simplicity, let’s start by considering interventions, do(Nomination = C), which establish candidate C as the nominee with certainty. Then, the most electable candidate has the highest P(Presidency = C | do(Nomination = C)). But as seen from the graph, P(Presidency = C | do(Nomination = C)) only equals P(Presidency = C | Nomination = C) if the variables which determine the presidency are independent of the variables which determine the nomination, which is unrealistic.

As can also be seen from the graph, misinterpreting p(win | nominated) as electability is the same kind of error that people make when they describe an observed association between a treatment and a disease outcome as “the effect” of the treatment, without accounting for confounding — the same graph structure applies to both scenarios.
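A tiny simulation makes the gap visible (toy numbers of my own; a single confounder z stands in for the whole “background” node, and for clarity the nomination is given no causal effect on winning, so any gap between the two quantities is pure confounding):

```r
# One shared background cause z of both nomination (x) and winning (y)
set.seed(2)
n <- 100000
z <- rnorm(n)
x <- 1 * (z + rnorm(n) > 1)   # nomination driven by background
y <- 1 * (z + rnorm(n) > 1)   # winning driven by the same background

mean(y[x == 1])  # p(win | nominated): inflated by the shared z
mean(y)          # p(win | do(nominated)): the causal quantity here
```

The conditional probability is well above the interventional one, even though being nominated does nothing causal for winning in this sketch.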

A limitation to the analysis is that it does not account for the possibility that the choice of nominee changes the values of variables that determine the presidency. This is addressed in Figure 2. However, even in this more general case, unrealistic independencies between determinants of the nomination and the presidency are still required for P(Presidency = C | do(Nomination = C)) to equal P(Presidency = C | Nomination = C).

Another limitation is that the intervention do(Nomination = C) sets the nominee to candidate C with certainty, but most mechanisms of support for C that we are interested in will only increase the probability that the nominee is C, and the strength of support may itself depend on background variables. We can address this limitation by evaluating electability for stochastic and conditional interventions, but this does not remove the requirement for unrealistic independencies if you want P(Presidency = C | Nomination = C) to be a good measure of electability.

There are many complicated features of the real world that have not been addressed in the analyses above — dynamic changes in probabilities, vote splitting in the primary, and so on. But my point is that, since P(win | nominated) appears to be a poor measure of electability in the analyses that distill the situation down to its basics, we have no good reason to expect P(win | nominated) will become a good measure of electability if we add all the complexities on top.

* For those unfamiliar with causal graphs: all nodes in the graphs are also determined by “disturbance” or “noise” variables that are independent of each other and of all other variables in the causal graph. By convention these are not shown, but they are assumed present.

OK… this didn’t render fully or correctly. Will try to figure out how to post it properly.

Here is an R script. In the default specification, the probability of nomination is very low (a0 = -2), but the probability of election conditional on nomination is very high (b0 = 2). Moreover, there is unobserved confounding (a1 = b1 = 2). An example along these lines is Carlos’s war with Canada, where the candidate is very unlikely to be nominated unless we go to war with Canada, but if we do the candidate is very likely to be elected. Running the script with this specification results in (my understanding of) Andrew’s proposed electability score being very high, while my proposed score is very low.

# install.packages(“xgboost”)

library(xgboost)

### Specification ###

# Number of simulations

N <- 100000

# Means of exogenous random variables

mu.z <- 0

mu.z.star <- 0

mu.u <- 0

mu.u.star <- 0

mu.v <- 0

mu.v.star <- 0

# SDs of exogenous random variables

sigma.z <- 1

sigma.z.star <- 1

sigma.u <- 1

sigma.u.star <- 1

sigma.v <- 1

sigma.v.star <- 1

# Coefficients for determining nomination status

a0 <- -2

a1 <- 2

a2 <- 1

a3 <- 1

a4 <- 1

# Coefficients for determining election status

b0 <- 2

b1 <- 2

b2 <- 1

b3 <- 1

b4 <- 1

### Simulation ###

# Generate exogenous random variables

z <- rnorm(N, mu.z, sigma.z)

z.star <- rnorm(N, mu.z.star, sigma.z.star)

u <- rnorm(N, mu.u, sigma.u)

u.star <- rnorm(N, mu.u.star, sigma.u.star)

v <- rnorm(N, mu.v, sigma.v)

v.star <- rnorm(N, mu.v.star, sigma.v.star)

# Generate observational data

x0 <- 1 * (a0 + (a1 * z) + (a2 * z.star) + (a3 * u) + (a4 * u.star) > 0)

y0 <- 1 * (b0 + (b1 * z) + (b2 * z.star) + (b3 * v) + (b4 * v.star) > 0) * x0

# Generate interventional data

x1 <- rep(1, N)

y1 <- 1 * (b0 + (b1 * z) + (b2 * z.star) + (b3 * v) + (b4 * v.star) > 0) * x1

# Construct design matrix for electability calculations

design <- cbind(z.star, u.star, v.star)

### Analysis ###

# Estimate Andrew's proposal (using nonparametric regression)

index <- which(x0 == 1)

n0 <- xgb.DMatrix(data = design[index, ], label = y0[index])

gbm.cv <- xgb.cv(data = n0, nfold = 10, nrounds = 500, objective = "binary:logistic", verbose = FALSE)

lambda <- which.min(gbm.cv[["evaluation_log"]][, "test_error_mean"][["test_error_mean"]])

m0 <- xgboost(data = n0, nrounds = lambda, objective = "binary:logistic", verbose = FALSE)

n0 <- xgb.DMatrix(data = design, label = y0)

p0 <- predict(m0, n0)

# Estimate Ram's proposal (using nonparametric regression)

n1 <- xgb.DMatrix(data = design[, c(1, 3)], label = y0)

gbm.cv <- xgb.cv(data = n1, nfold = 10, nrounds = 500, objective = "binary:logistic", verbose = FALSE)

lambda <- which.min(gbm.cv[["evaluation_log"]][, "test_error_mean"][["test_error_mean"]])

m1 <- xgboost(data = n1, nrounds = lambda, objective = "binary:logistic", verbose = FALSE)

p1 <- predict(m1, n1)

### Results ###

# Summarize Andrew's electability scores

summary(p0)

sd(p0)

# Summarize Ram's electability scores

summary(p1)

sd(p1)

# Summarize paired differences

summary(p0 - p1)

sd(p0 - p1)

# Proportion of simulations where Andrew's score exceeds Ram's

mean(p0 > p1)

Notation:

z: presently nonpublic information which influences whether candidate is nominated and whether candidate is elected

z*: presently public information which influences whether candidate is nominated and whether candidate is elected

u: presently nonpublic information which only influences whether candidate is nominated

u*: presently public information which only influences whether candidate is nominated

v: presently nonpublic information which only influences whether candidate is elected

v*: presently public information which only influences whether candidate is elected

x: 1 if candidate is nominated, 0 if not

y: 1 if candidate is elected, 0 if not

A, A*, B, B*, C, C*: unspecified marginal distributions

f, g: unspecified functions

Observational Model:

z ~ A

z* ~ A*

u ~ B

u* ~ B*

v ~ C

v* ~ C*

x := g(z, z*, u, u*)

y := f(x, z, z*, v, v*)

Interventional Model:

z ~ A

z* ~ A*

v ~ C

v* ~ C*

x := 1

y := f(x, z, z*, v, v*)

Andrew’s Proposal:

Electability := P(y = 1 | x = 1, z*, u*, v*) in the observational model

Ram’s Proposal:

Electability := P(y = 1 | z*, v*) in the interventional model

Thanks for this Carlos. I also have been puzzled by the number of distinct conversations stemming from the original one. My focus has primarily been on the straight contradiction between your first quote from me and your quote from Andrew. And I agree with your concluding statement: I have no problem with the claim that P(Win | Nominated) carries some information about electability, and that under certain special conditions it may even give a good approximation of it. My original post was designed to remind people that these two concepts are distinct, and *not necessarily* good approximations of each other. I think (?) this has been broadly accepted, except by Andrew, who hasn’t retracted his repeated claim that I was making some sort of error (the allegation seemed to me to be of an elementary probability confusion). I do think that his mention of Newcomb is interesting in this context, but the relationship between this case and that one is a discussion for another day.

Thanks all for your feedback and insights, and thanks to Andrew for hosting this very interesting thread.

I’m on it!

Last three words cut.

typo, plz cut last stray 3 words

Thanks for this Carlos. Multiple discussions were an issue, and I think that happened because the mathematical point was never acknowledged clearly. One caveat: Ram wasn’t just making the general mathematical point, he was also making the practical point, which is where the disagreement arose.

Niall:

Huh? Yes, Sam Wang was wrong on that assessment! The perverse incentive is that his overconfidence got him attention (see item 3 here). As I wrote: “After the election, Wang blamed the polls, which was wrong. . . . The mistake was not in the polls but in Wang’s naive interpretation of the polls which did not account for the possibility of systematic nonsampling errors shared by the mass of pollsters, even though evidence for such errors was in the historical record.”

These markets are driven by sentiment and fall into every known trap: confirmation bias, motivated reasoning, wishful thinking, etc.


https://www.bettingmarket.com/fools.html

Two points. (Sorry about the long response.)

1 — Newcomb’s paradox:

I agree that Newcomb’s paradox is fascinating and can be related to prediction markets. But the issues with p(win | nominated) as “electability” arise even in situations where Newcomb’s paradox does not apply. For example, don’t view the probabilities in the example tree I gave as values from prediction markets. Instead, imagine they are my subjective individual beliefs of what would happen if I didn’t vote in the primary. Then if I voted for the candidate with highest p(win | nominated), I would still be advantaging Trump!

2 — Electability:

As you’re likely aware, people run polls in which potential voters are asked to make a binary choice between Trump vs. Biden, or between Trump vs. Warren, or Trump vs. Sanders, etc. The results of these polls could be used to assign to each candidate a 538-style probability of winning against Trump if they were nominated, which I think would make a fairly good assessment of electability.

However, the 1-vs-1 poll-based assessments measure a fundamentally distinct quantity from p(win | nominated). I’m not sure whether you’ve already observed this.

For instance, in the example probability tree I gave, P(President = A | nominated = A) was 54 percent. However, using the exact same numbers found in that probability tree, a 1-vs-1 poll-based assessment of the probability of A winning vs. Trump would give 44 percent in expectation (=0.2 * 0.8 + 0.8 * 0.35), including 80 percent if we live in the world with less sexism and 35 percent if we live in the world with more. In no case does the quantity equal p(win | nominated)!
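Spelling out that expectation in code (same numbers as quoted above):

```r
# Expected 1-vs-1 win probability for A, averaging over the two "worlds"
p.world <- c(less.sexism = 0.2, more.sexism = 0.8)   # probability of each world
p.win   <- c(less.sexism = 0.80, more.sexism = 0.35) # A's win prob. in each world

sum(p.world * p.win)  # 0.44, versus p(win | nominated) = 0.54
```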

So even if there is no market inefficiency, even if polls are perfect, even if probabilities don’t change, and so on — even if the prediction markets and poll-based analyses in fact derive from the same numbers — even in this case p(win | nominated) is measuring a fundamentally different quantity from the 1-vs-1 comparisons that appear most naturally to be measures of electability.

Given this, I think p(win | nominated) should not be called electability. It’s clear from Mankiw’s posts and others’ reactions (including, as far as I can tell, Joshua Miller’s and your own) that p(win | nominated) is being viewed as a guide to electability carrying the same kind of information as 1-vs-1 poll-based results. But this is incorrect. Even in the best of cases, where everything is based on the same underlying probability tree, p(win | nominated) measures a different quantity than the 1-vs-1 comparisons.

This raises a question: what do the 1-vs-1 poll-based comparisons measure, expressed in the language of probability? I *think* they measure P(President = A | do(nominated = A)), where the do() intervention is whatever each poll respondent vaguely expects would happen for A to be nominated. Actually, I think this would probably be a decent match, on average, to the kind of interventions that would in reality get the candidates nominated! So they are very relevant to electability.

+1 to all this. We seem to be having multiple discussions.

There is a primary today in which we can vote. One month from today, there is another primary, the convention is two months from today, and the election is three months from today. Assume (at least for now) that our vote is not large enough to affect anything.

At each point in time, other people’s votes are determined by a known stochastic model. There is a vector of state variables, z, that may affect anything. Note that calculations regarding “electability” are part of z since anyone can make those calculations. The candidate that gets the most votes in the two primaries wins the nomination. We do not make donations or do anything else that might affect anything … let’s keep this simple for now.

As a first step, assume our only objective is to identify which candidate we hope will be nominated because he/she is most likely to win the general election. This is the clearest definition of “electability” I can think of. If you disagree, state the objective in plain English without using the word “electability”.

Can we accomplish our objective using betting market information? (Assume the betting markets are perfect.)

Next step: add an action we can take now that will affect the system. Our objective is to increase as much as possible the probability that a Democrat will win the election. How should we act? This is not a simple problem! You can’t just act to help the most “electable” candidate because different candidates may be affected differently by our action. (The action may affect the betting markets … assume we know how it will affect the betting markets.)

That’s also a good example. Another one that everyone agrees with! Terry gave the best summary of this discussion: “We don’t even agree on what we disagree on.”

Ram 23.11> the concept of electability is not captured by P(Win | Nominated) […] If my objective is instead *intervening* to hand the nomination to the most electable candidate, P(Win | Nominated) is not what I’m looking for.

Andrew 23.11> You are wrong. The concept of electability is *exactly* captured by P(win|nominated).

Joshua 28.11> Of course, we all agree — from the beginning, I presume — that we can construct simple models in which P(Win | Nominated) is a terrible measure of electability.

Ram 23.11> Even ignoring issues of market inefficiency, we cannot *in general* estimate electability using ratios of these prediction market prices.

Joshua 30.11> I gave examples of how you could estimate with those ratios, and how the ordering ____would likely____ be informative.

Joshua 30.11> The disagreement was about claims that we shouldn’t care about P(win|nominate) if we care about electability, and that P(win| nominate by intervention) is what we are after.

The disagreement seems to be that one side says “A!=B in general” and the other side responds “A~B sometimes”.

“We care about X, not about Y” and “we care about Y to the extent that it may be close to X” are not contradictory claims either.
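For concreteness, the ratio estimate being debated can be sketched in a few lines. The prices below are invented; the identity it relies on, P(Win | Nominated) = P(Win) / P(Nominated), follows from Bayes plus the fact that winning the general requires the nomination, so P(Nominated | Win) = 1.

```python
# Hypothetical market prices (invented numbers): one contract pays $1 if the
# candidate wins the presidency, another pays $1 if the candidate is nominated.
# Since P(Nominated | Win) = 1, Bayes gives P(Win | Nominated) = P(Win) / P(Nominated).
prices = {
    "A": {"win": 0.14, "nominated": 0.35},
    "B": {"win": 0.10, "nominated": 0.20},
}

def implied_electability(p):
    return p["win"] / p["nominated"]

ranking = sorted(prices, key=lambda c: implied_electability(prices[c]), reverse=True)
# B's implied P(win | nominated) = 0.50 beats A's 0.40, even though A leads
# both raw markets: this is the "ratio" ordering the thread is arguing about.
```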

]]>Hi Ram

I think you made a good point that had not been addressed here. I think the disagreement, to the extent there is one, is about the actual relevance of P(Win | Nominated).

]]>Sounds good.

Ps. You wrote:

> [Andrew] “electability” is P(win|nominated), nothing more, nothing less

Sure, and then the paragraph I quoted can be rewritten as follows:

« the electability of a candidate given that we *observe* that they win the nomination is different from the electability of a candidate given that we *intervene* so that they win the nomination. »

I agree. I also think the “electability” of a candidate depends on which party they run for.

My main point was not Ram’s premise, but the conclusion. I gave examples of how you could estimate with those ratios, and how the ordering would likely be informative.

Anyway, good time to quit. Would be easier to hash out f2f.

]]>More Anonymous:

you wrote: “If your goal is to nominate the candidate who will be most likely to beat Trump, then there is no mathematical guarantee that you should support the candidate with the highest p(win | nominated) at the time of the primary.”

I don’t think the mathematical guarantee was ever controversial, or that there was confusion about this.

The disagreement was about claims that we shouldn’t care about P(win|nominate) if we care about electability, and that P(win| nominate by intervention) is what we are after.

]]>More:

I agree that once you consider an intervention (even a small intervention such as the casting of a single vote), the probabilities will change. If the betting market has no vig and is efficient (an assumption that is clearly violated in this case, as discussed in my above post; indeed that was the main point of my post), it will reflect some consensus about the joint probability distribution right now, implicitly integrating over all possibilities of future interventions. Some aspects of the above comment thread remind me of discussions of Newcomb’s paradox.

]]>It’s surprising to me that any of this is controversial. Here’s a simple, natural, numeric example where supporting the candidate with higher p(win | nominated) increases the probability of Trump winning. Therefore, it is incorrect to assume the candidate with higher p(win | nominated) is more electable.

There is a ton of discussion and confusion in the comments above, but the problem has an unambiguous answer: If your goal is to nominate the candidate who will be most likely to beat Trump, then there is no mathematical guarantee that you should support the candidate with the highest p(win | nominated) at the time of the primary. To see this, here is a simple numeric example in which supporting the candidate with the highest p(win | nominated) increases the probability of Trump winning the election.

The example involves two Democrat candidates, A and B, and is shown by this conditional probability tree:

https://i.postimg.cc/V6B6qYyM/Electability.jpg

The first tree branch shows two possibilities: sexism being relatively less severe than polls suggest (P = 0.2), which advantages female candidate A, and sexism being relatively more severe (P = 0.8), which advantages male candidate B. The rest of the tree shows conditional probabilities of nomination and election to the presidency.

Note that P(President = A | Nominated A) > P(President = B | Nominated = B), so that according to Andrew and Mankiw, we should view A as having greater electability.

However, if during the primary we add delta in support of candidate A being the nominee (0.6 -> 0.6 + delta; 0.2 -> 0.2 + delta), then the probability of a Trump win changes to P(President = Trump) = 0.488 + 0.06 * delta. Counterintuitively, the probability of Trump winning increases when we support A!

For full generality, we can also imagine that the strength of our support can differ if there is less sexism (0.6 -> 0.6 + delta1) vs. more sexism (0.2 -> 0.2 + delta2). Then P(President = Trump) = 0.488 - 0.06 * delta1 + 0.12 * delta2. So, to ensure our support for candidate A reduces the probability of Trump winning, we would need our support to be more than twice as strong in the world/branch with less severe sexism (delta1 > 2 * delta2). It is not clear how that could be ensured, since we do not know which world we live in, and since the factor of 2 itself depends on the probability associated with each world, which may be unobserved by us.
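In case the linked image does not load, here is one reconstruction of the tree in code. The election-stage conditionals P(Trump | nominee, world) are assumptions chosen to be consistent with the numbers stated above (P(President = Trump) = 0.488 and the delta1/delta2 sensitivities); other reconstructions matching those numbers exist.

```python
# One reconstruction of the conditional probability tree, consistent with the
# stated numbers. P_TRUMP values are assumed, not taken from the image.
P_LESS_SEXISM = 0.2                        # world 1 advantages A
P_NOM_A = {1: 0.6, 2: 0.2}                 # P(Nominated = A | world)
P_TRUMP = {("A", 1): 0.2, ("A", 2): 0.65,  # assumed P(Trump | nominee, world)
           ("B", 1): 0.5, ("B", 2): 0.5}

def p_trump(delta1=0.0, delta2=0.0):
    total = 0.0
    for world, pw in ((1, P_LESS_SEXISM), (2, 1 - P_LESS_SEXISM)):
        pa = P_NOM_A[world] + (delta1 if world == 1 else delta2)
        total += pw * (pa * P_TRUMP[("A", world)] + (1 - pa) * P_TRUMP[("B", world)])
    return total

def p_win_given_nom(cand):
    num = den = 0.0
    for world, pw in ((1, P_LESS_SEXISM), (2, 1 - P_LESS_SEXISM)):
        p_nom = P_NOM_A[world] if cand == "A" else 1 - P_NOM_A[world]
        num += pw * p_nom * (1 - P_TRUMP[(cand, world)])
        den += pw * p_nom
    return num / den

# p_trump() == 0.488, and p_trump(d, d) == 0.488 + 0.06 * d: equal support for A
# in both worlds *raises* Trump's chances, even though
# p_win_given_nom("A") > p_win_given_nom("B").
```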

In summary, even in simple examples, there is no guarantee that the candidate with highest p(win | nominated) is the candidate to support if your goal is to beat Trump. Therefore I think it is wrong and misleading to call p(win | nominated) electability.

Notes:

* In Carlos’s nice example, a war against Canada is used instead of sexism. But I wanted to avoid using a future event for the split in the tree, to remove irrelevant considerations about how the probabilities will change.

* My example assumes that changing the conditional probabilities of nomination by delta does not alter the other conditional probabilities in the tree. If you suppose delta is a small perturbation (true for most voters!), then this is a natural assumption, and the issues with describing p(win | nominated) as electability still arise.

> this *intervene* thing didn’t exist

I agree that there is not one single way to “intervene so that they win the nomination”. I have not even used the do() notation in any of my messages, and it doesn’t appear in that paragraph either. I think interventions exist. We can “not intervene” (a.k.a. “observe”), or we could “intervene” in different ways to increase the nomination probability of our preferred candidate. The resulting probability will depend on what we do (with respect to the baseline where we do nothing).

> [Andrew] “electability” is P(win|nominated), nothing more, nothing less

Sure, and then the paragraph I quoted can be rewritten as follows:

« the electability of a candidate given that we *observe* that they win the nomination is different from the electability of a candidate given that we *intervene* so that they win the nomination. »

Ram’s first two points (given that definition of electability) simply become:

(1) In general, “electability if I don’t intervene” != “electability if I do intervene”

(2) When people are trying to figure out which candidates are more or less electable, what they’re usually after is some particular “electability if I do intervene”

I don’t think you need detailed models of things to say that in general “interventions” change the probabilities.

The previous probability was a weighted average over our potential actions; when we intervene, one of those becomes real and the others vanish.

What you may need a detailed model for is to justify that prediction market prices give you good estimates of the “post-intervention” electabilities.
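A tiny sketch of that averaging, with invented numbers: before we act, P(win) mixes over our potential actions, and intervening collapses the mixture to one branch.

```python
# Invented numbers: action -> (probability we take it, P(win | action)).
actions = {
    "do_nothing":    (0.5, 0.40),
    "release_video": (0.3, 0.55),
    "donate":        (0.2, 0.45),
}

# Pre-intervention: a weighted average over our potential actions.
p_win_marginal = sum(p_a * p_win for p_a, p_win in actions.values())  # 0.455

# Post-intervention: one branch becomes real, the others vanish.
p_win_after_intervening = actions["release_video"][1]                 # 0.55
```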

Of course I agree that I’m not saying anything that we have not discussed already. Even Daniel agrees with me so it’s a good time to quit :-)

]]>I’m with Carlos on pretty much everything he said.

]]>I’ll try to clarify.

I wrote that I don’t think anyone disagreed on what Carlos wrote here

The disagreement, if any, was about the following claim:

« the probability that a candidate wins given that we *observe* them winning the nomination is different from the probability that a candidate wins given that we *intervene* so that they win the nomination. »

My reading wasn’t that anyone was claiming that these were equal, just that this *intervene* thing didn’t exist.

Of course I can’t speak for Andrew here.

Andrew wrote:

“Electability” is just a word. It means different things to different people. As a statistician, I think “electability” is P(win|nominated), nothing more, nothing less.

Ram Wrote:

What I’m wondering is why anyone should care about P(Win | Nominated). My claim is that some people (e.g., Mankiw) care about this quantity because they think it is useful for identifying candidates who will fare better in the general election. My point is that this is fallacious.

…later…

(1) In general, P(Win | Nominated) != P(Win | do(Nominated))

(2) When people are trying to figure out which candidates are more or less electable, what they’re usually after is some particular disambiguation of P(Win | do(Nominated))

(3) Efficient prediction markets for who wins the election, and who wins the nomination, can be used to estimate P(Win) and P(Nominated)

(4) P(Nominated | Win) = 1

Ergo,

(5) Even ignoring issues of market inefficiency, we cannot in general estimate electability using ratios of these prediction market prices.

This seemed to be the point of disagreement. Andrew disagreed with this. He seemed to acknowledge that he wasn’t trying to make his best case.

I happened to be closer to Andrew’s view here, and I chimed in. I made an argument for why someone should care about P(Win | Nominated), and why it is not fallacious. This directly contrasts with the conclusion in (5). I have also made a case against premises (1)-(2). In particular, that (1) is ill-posed and (2) is not true.

IMO the main point of disagreement has been discussed, it seems to boil down to what we want electability to mean, and how relevant we view those pathological examples to be for the actual election process.

Ps. With regard to Carlos’ example just above with joint distributions, I am not sure what is being addressed that hasn’t been addressed already. I also think it would be dangerous to comment because we likely do not have common assumptions on our model of the electoral process, the nature of the betting market, and the allowable feedback between the two.

]]>welcome to the confused crowd :-)

Ram,

I agree that your latest example, where the candidate “moves”, may be easier to understand than the case where the candidate is “fixed” but the environment is different in the “general” cases than it was in the “special” cases where the candidate was expected to get the nomination. Most people have the intuition of a trade-off between being “extremist” enough to please those who vote in the primaries and being “centrist” enough to win the general election. Many people will also cynically deduce that the optimal strategy is to say one thing to get the nomination, something else to win the election, and then do whatever you want.

Joshua,

I don’t think there has been much of a disagreement between you and the “anti-Andrew crowd”.

I’m not sure about Andrew, honestly. In his last comment he said that “we just care about Pr(X,Y), where these probabilities average over all uncertainties”.

Maybe I’m complicating things, but I find P(X=winner, Y=nominated) to be complicated. It’s a marginal probability obtained by integrating over all uncertainties the incredibly complicated joint distribution P(X, Y, Z).

Those uncertainties evolve as things happen, the joint distribution changes over time, the marginal distribution changes over time. While most of those uncertainties are outside of our control, our actions also determine to some extent that the distribution goes in one or other direction.

Some actions can have a big impact. Say I have one preferred candidate A, and I have a video of the favorite for the nomination, B, in blackface or whatever. If I release the video, I cause P(X,Y)=P(X|Y)P(Y) to change:

– One can expect P(Y=A nominated) to go up as P(Y=B nominated) goes down.

– It’s also reasonable to expect that in this case P(X=A wins) will increase. At least that was my reasoning when I decided to release the video!

– P(X=A wins|Y=A nominated) could go in either direction (and there is no reason why it shouldn’t change). I don’t really care; my objective was to increase P(X=A wins). One reason why it could go down: the bar for A’s nomination is lowered, so A may now be a worse nominee than if A had managed to beat B without my intervention. One reason why it could go up: a nasty fight with B could have been damaging for both, regardless of who finally won the nomination.

– P(“a candidate from my preferred party wins”) could also go in either direction. This may be a problem if my actual objective was to increase the party’s chances, and the only reason I wanted to increase P(X=A wins) is that I noticed that P(X=A wins|Y=A nominated) was higher than P(X=B wins|Y=B nominated).
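A numeric sketch of this scenario (all probabilities invented) that exhibits every bullet point at once: releasing the video raises P(X=A wins) while lowering both P(X=A wins|Y=A nominated) and the party’s overall chances.

```python
# Invented numbers: candidate -> (P(Y = nominated), P(X = wins | Y = nominated)).
before = {"A": (0.3, 0.50), "B": (0.7, 0.40)}
after  = {"A": (0.8, 0.45), "B": (0.2, 0.10)}  # video tanks B; bar for A is lowered

def p_wins(j, cand):
    p_nom, p_win_given_nom = j[cand]
    return p_nom * p_win_given_nom

def p_party_wins(j):
    return sum(p_wins(j, c) for c in j)

# P(X=A wins):                 0.15 -> 0.36  (my objective: achieved)
# P(X=A wins | Y=A nominated): 0.50 -> 0.45  (went *down*)
# P(party wins):               0.43 -> 0.38  (also went down: the last bullet's caveat)
```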

]]>Joshua said:

Carlos writes:

The disagreement, if any, was about the following claim:

« the probability that a candidate wins given that we *observe* them winning the nomination is different from the probability that a candidate wins given that we *intervene* so that they win the nomination. »

I think we all agree on most things. I don’t think anyone ever disagreed on this.

Well this is discouraging. We don’t even agree on what we disagree on. Hello rabbit-hole.

]]>Yes, I think the word electability means different things to different people, and for many here the market-implied p(win|nominated) doesn’t mesh.

]]>Carlos writes:

The disagreement, if any, was about the following claim:

« the probability that a candidate wins given that we *observe* them winning the nomination is different from the probability that a candidate wins given that we *intervene* so that they win the nomination. »

I think we all agree on most things. I don’t think anyone ever disagreed on this.

Ram writes:

What we mean by “electability” is how a candidate who has the characteristics she *actually has* would perform in the general election. What we do NOT mean by “electability” is how a candidate who has characteristics she *is predicted to have given that she was nominated* would perform in the general election.

I had a feeling everything would hinge on our definition of electability, and our model of the election process.

A moderate candidate who is trainable may not currently have characteristics A that serve one well both in the primary and in the general, characteristics that a more polished, extreme-left candidate currently has. These characteristics can be developed, especially if one has what it takes to get through a primary.

Betting markets *can* price in these characteristics now, with bets on nomination. But markets may also price in potentially fixed characteristics B that speak to electability in the general but mean nothing in the primary (like being moderate), with bets on winning, or characteristics C that mean something for both (like charisma), with bets on both. The estimate of some of these characteristics B, and their relative importance in the general, may not change at all conditioning on being nominated. Sure, the estimate of characteristics A may change, but that is fine: making it through the nomination develops the candidate on certain dimensions, and also mechanically gives an electability bump by awarding the party label.

Of course, there is the caveat that you note with your nice example of the moderate moving left. Some of these characteristics B may not stay fixed if the candidate needs to get through the primary. This is a path we have to average over when thinking about what P(Win | Nominated) means.

The question becomes: what do we want electability to mean? I’d want to ignore certain actual characteristics that I believe will change if the candidate makes it through the nomination process. This is what I was getting at in my .9 vs .1 example, which I presume you don’t have an issue with. The others, like having to move left, I think relate to general electability also, because we have to be realistic about what a presidential candidacy would look like if this person wants to represent the Democratic party. For example, Bloomberg’s electability as a Democrat and his electability as a Republican are different things. I don’t see why we would want to talk about Bloomberg’s electability as some general 1-D construct. I think it is a contingent thing, as I have mentioned in other comments.

In this sense, my view of electability also does not gel with Daniel’s here:

> Really what we want is p(elected | we hand the nomination to the candidate now), which incorporates the ways the candidate would change through time between now and election, but not the effect of the nomination process on those changes…

Carlos

Ps. I was responding to your message way above, where you wrote:

> Obviously becoming more nominate-able at t+1 increases overall electability

Not necessarily. One can easily imagine a scandal affecting the party that changes the nomination probabilities and at the same time reduces the probability of winning the election for all of them. (By the way, I was joking about the “contrivedness” of the examples.)

Carlos

let me re-phrase that:

“Obviously becoming more nominate-able at t+1 increases overall electability, on average.”

I was thinking of the political & campaign skills, polish, support, etc. Clearly there are edge cases that we can cook up.

]]>The use of a slash / instead of a vertical bar | in writing a conditional probability seems to be preferred in the philosophical literature. I’ve read a number of philosophical papers on probability and inference and they invariably wrote P(A/B) rather than P(A|B). The vertical line seems to be universal amongst statisticians. They may be reading it “Probability of A on B” which would make sense.

]]>Daniel,

I’m deliberately trying to avoid talk of intervention since this seems to be confusing the discussion, but I think you and I have the same concept of electability. As you can see in my comment, I’m not necessarily talking about the characteristics they have today. I’m talking about our forecast of how their characteristics would evolve over the course of the election, were they not required to evolve in any particular ways to secure the nomination first. Andrew’s concept bakes into this forecast that things have already evolved in ways that make them the nominee. But when deciding who to nominate (whether as a voter, a donor, an activist, or whatever), we remove from consideration the factors that influence nomination itself, since we’re substituting our judgment for these factors.

]]>Carlos:

I see what you are getting at now. Your Canada-war example is very good.

Candidate B is more “electable” conditional on a war with Canada, but less “electable” conditional on no war with Canada. The problem is that Pr(Win|Nominated) measures only the first conditional probability. Our best estimate of “electability” today is a weighted average of the two conditional probabilities.

I have probably just muddied the waters. I do that a lot.

]]>Terry, the factors you mention are reasons for prices to fluctuate through diffusion through time. What I meant was: if they announce the nominee on Monday morning, one minute before the market opens, based on Friday’s closing prices, there would be a price shock at Monday’s open. The information that the candidate has been nominated makes some of the scenarios incorporated into the previous p(win | nominated) pricing have zero probability, so those drop out of the mix, and other scenarios that were deemed relatively unlikely are now given much higher probability. The weighted average across all the scenarios changes instantaneously when that announcement is made.

This is not just a diffusive jiggling caused by a few people taking cold medicine or needing to exit their positions or getting new infusions of capital they need to invest in their funds or whatever.

This kind of thing happens, for example, when the Fed announces rate target changes… The market has priced in the average effect over all the possible rate changes it might make, but when one particular change is realized, and particularly when it’s one that the market deemed unlikely, it causes re-pricing throughout the market in a shock. This is why the Fed likes to hint at what it’s likely to do ahead of time, so that prices equilibrate more slowly and people aren’t caught off guard.

If the Democratic party nominated a person based on pricing, and that candidate had relatively high p(nomination), the size of the shock on p(win|nominated) would be smaller. When the p(nomination) was small to begin with, the shock on p(win | nominated) would be potentially dramatic.
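That last point can be made precise in a toy calculation (numbers invented). Under the no-vig identity P(win) = P(win | nominated) * P(nominated), and holding the conditional fixed for simplicity, the announcement jump in the win contract is P(win | nominated) * (1 - P(nominated)), which shrinks as P(nominated) grows.

```python
# Toy repricing at the nomination announcement (invented numbers). Before the
# announcement the win contract trades at P(win) = P(win|nom) * P(nom); the
# instant the nomination is certain it reprices to P(win|nom).
def announcement_jump(p_win_given_nom, p_nom):
    before = p_win_given_nom * p_nom   # market price pre-announcement
    after = p_win_given_nom            # price once the nomination is certain
    return after - before              # == p_win_given_nom * (1 - p_nom)

favorite = announcement_jump(0.5, 0.9)   # 0.05: mild repricing
longshot = announcement_jump(0.5, 0.1)   # 0.45: dramatic shock
```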

Ram: I think we’ve come to the heart of the matter. What definition of “electability” do you want to use? It seems you want basically “how electable is the candidate if the candidate has the properties the candidate has now, but no longer has to vie for nomination”, whereas Andrew seems to want “the probability that the candidate who is nominated, having the properties we can infer about that candidate at the later time of nomination, then goes on changing their properties, and eventually the election is held and the candidate wins”.

Neither one seems realistically what we want to know. Really what we want is p(elected | we hand the nomination to the candidate now), which incorporates the ways the candidate would change through time between now and election, but not the effect of the nomination process on those changes… essentially like an ordinary differential equation which has two kinds of forcing: nomination campaigning, and election campaigning… and we eliminate the nomination campaigning
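The two-forcings picture can be sketched as a toy discrete-time relaxation. Everything here is invented (positions, rates, and the win-probability function); it only illustrates how removing the nomination forcing changes where the candidate ends up.

```python
# Toy "two forcings" model: a candidate's position relaxes toward the primary
# electorate while campaigning for the nomination, then toward the general
# electorate. Handing over the nomination now removes the primary forcing.
PRIMARY_TARGET, GENERAL_TARGET = -1.0, 0.0   # positions on a left-right axis
RATE = 0.2

def evolve(x, target, steps):
    for _ in range(steps):
        x += RATE * (target - x)             # relax toward the active forcing
    return x

def win_prob(x):
    return max(0.0, 1 - abs(x - GENERAL_TARGET))   # best when centered

x0 = 0.4                                     # a moderate candidate today
via_primary = evolve(evolve(x0, PRIMARY_TARGET, 10), GENERAL_TARGET, 10)
handed_now  = evolve(x0, GENERAL_TARGET, 20)
# Skipping the primary forcing leaves the candidate closer to the general
# electorate, so win_prob(handed_now) > win_prob(via_primary) in this toy.
```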

]]>Daniel:

Agree. New information is revealed all the time by all sorts of things, so I expect prices to change constantly. Even if no information is revealed by events you think should matter, prices will change because market participants do random things like change their minds for no good reason, sell for liquidity purposes, or start taking a nighttime cold medicine that makes them do wacky things.
