Thanks for all the comments. I responded here. To summarize briefly:

1. Many people commented that the laws of probability work just fine in quantum mechanics, you just have to include the act of measurement in your model: there is no latent joint distribution that exists out there to be passively measured.

I agree, but my point was that when we apply probability theory to analyze surveys, experiments, observational studies, etc., we typically *do* assume a joint distribution and we typically *do* treat the act of measurement as a direct application of conditional probability. Classical probability theory (which we use all the time in poli sci, econ, psychometrics, astronomy, etc.) needs to be generalized to apply to quantum mechanics, which makes me wonder if it should be generalized for other applications too.

2. Some commenters discussed work in political science and psychometrics in which researchers are working on generalized probability models, inspired by quantum probability, to do statistical data analysis. Looks like it could be interesting.

P.S. Just to clarify further: I know more physics than most statisticians do, but that’s not a lot, and I certainly don’t think I have anything useful to say about quantum mechanics beyond what Richard Feynman (or, for that matter, Bill Jefferys) has written already. Where I do have expertise is in the application of probability models to diverse applied fields. And what I’m wondering is whether it would be appropriate to generalize the usual probability models there, just as it is necessary to do for quantum mechanics.

Andrew: This has been a very interesting discussion. Here are some additional items that some who are following it may find of interest.

As one commentator (rege) pointed out, Wigner suggested expanding probability theory to allow negative densities. This is obviously not standard probability theory, but the interesting conflicts that are well known in quantum mechanics can be resolved this way.

Saul Youssef, at B.U., has proposed another formalism, where probabilities are allowed to take on complex values. This also can resolve the conflicts. Note: This idea should not be confused with the complex wave function that is the solution of the Schroedinger equation, and which has the property that the absolute square of the wave function (i.e., the product of the wave function and its complex conjugate) can be interpreted directly as a probability. Youssef's complex quantities obey axioms that are analogous to those of standard probability theory, but appropriately generalized.

It should be noted that the solution of the Schroedinger equation (or equivalent) yields probabilities as noted above; but the Schroedinger equation itself depends on the experimental setup, including the choice of what to measure, so in essence these probabilities do condition on the experiment being performed, as Leslie Ballentine advocates.

The thing is that in quantum mechanics you just can't treat the experimental equipment and the test particles as separate; they are entangled, so that there is really only one system consisting of equipment + test particles. This is why (in my view) the simplest resolution of the problem is still Ballentine's.

Of course, the standard discussion of this is Bell's theorem, the famous "no-go" theorem that says "no hidden variables," i.e., no joint distribution.

Finally, there is the very interesting example of the Kochen-Specker theorem. This theorem displays an example where we have the conditional distributions (over a finite state space, no less), a set of distributions that represents a particular quantum system, and yet no joint distribution exists from which these conditionals can all be derived. This makes the point that Andrew and I made, but for a particularly simple system.

About 10 years ago I mentioned the Kochen-Specker theorem to Jim Berger, who nonchalantly pointed out to me that even in classical probability you can have a full set of conditionals with no underlying joint distribution. His example was P(x|y)=N(y,1), P(y|x)=N(x,1) where N(a,b) is normal with mean a and variance b. The reason why there is no underlying joint distribution here is of course that the obvious distribution is not normalizable. If you attempted to obtain a sample from "the joint distribution" by Gibbs sampling, the attempt would fail, since the problem is unidentified and the samples would wander all over the place. The reason why there isn't a joint distribution here is different from the Kochen-Specker example, but it's worthwhile reminding ourselves that even very simple examples exist of full conditionals without joints.
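Berger's example is easy to try numerically. Both conditionals are consistent with a candidate joint density proportional to exp(-(x-y)^2/2), which is constant along the line x = y and so has no finite integral. The sketch below is only an illustration, not anything from the original exchange: the function names, chain counts, and seed are made up for the example. It runs many independent Gibbs chains and reports how the spread of x across chains changes with the number of sweeps.

```python
import random
import statistics


def gibbs_chain(n_steps, rng, x0=0.0, y0=0.0):
    """Alternate draws from x|y ~ N(y, 1) and y|x ~ N(x, 1); return the x path."""
    x, y = x0, y0
    path = []
    for _ in range(n_steps):
        x = rng.gauss(y, 1.0)  # second argument is the standard deviation
        y = rng.gauss(x, 1.0)
        path.append(x)
    return path


def spread_across_chains(n_chains, n_steps, checkpoints, seed=0):
    """Variance of x across independent chains at each checkpoint sweep."""
    rng = random.Random(seed)  # hypothetical seed, for reproducibility only
    paths = [gibbs_chain(n_steps, rng) for _ in range(n_chains)]
    return {t: statistics.pvariance([p[t - 1] for p in paths])
            for t in checkpoints}


if __name__ == "__main__":
    for t, v in spread_across_chains(300, 400, (10, 100, 400)).items():
        print(f"sweep {t:4d}: variance across chains = {v:.1f}")
```

If a proper joint distribution existed, the variance across chains would level off at the stationary value. Here each sweep adds two independent unit-variance shocks to x, so the chain is a random walk and the variance after t sweeps grows like roughly 2t.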

I thank Andrew for provoking a very interesting and fruitful discussion.

Bill: Yes, I know about examples where you can write conditional distributions and there is no underlying joint distribution. But the two-slit example is different: it's not just that you can *write* these conditional distributions. You actually *observe* them. In a classical statistical setting you won't simultaneously observe x|y ~ N(y,1) and y|x ~ N(x,1) for all x and y.

Andrew: Right, that's what makes the Kochen-Specker theorem so interesting. It's very different from the example that Jim Berger gave me.

Here's a practical example where I think political surveys suffer from the same problem. A recent poll asked this question:

I don't think you can measure public opinion on a question like that without affecting the results.

At least I think this is similar, but I know almost nothing about physics and only a little more about statistics.

I'm still unclear as to the nature of the conflict; consider this situation:

Suppose drug dealers routinely enter Central Park from the SE entrance and travel to the NW exit, and drug users enter from the SW entrance and travel to the NE exit.

Experiment 1: Close the SW entrance and have police search everyone who comes out of the park; record the distribution of drugs and call it p1(y).

Experiment 2: Close the SE entrance and have the police search everyone who comes out of the park; record the distribution of drugs and call it p2(y).

Experiment 3: Open both entrances and have the police search everyone who comes out of the park; record the distribution of drugs and call it p3(y).

Experiment 4: Now have an extra detachment of police sit at the two entrances and write down how many people go through the entrances. Assuming an efficient market, etc, the probability of a person entering the park going through a particular entrance will be 1/2. You can also record the distribution of drugs at the far end of the park as before, thus you can get p4(y), and you can also observe the conditional distributions, p4(y|entrance=SE) and p4(y|entrance=SW). You'll find that p4(y|entrance=SE) = p1(y) and p4(y|entrance=SW) = p2(y). You'll also find that p4(y) = (1/2) p1(y) + (1/2) p2(y). So far, so good.
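The bookkeeping in Experiment 4 is just the law of total probability, and a small sketch may make it concrete. Every number below is hypothetical: the outcome categories ("none", "small", "large"), the values of p1 and p2, and the "observed" p3 are invented purely to illustrate the calculation.

```python
from fractions import Fraction

# Hypothetical search outcomes with one entrance closed (Experiments 1 and 2).
p1 = {"none": Fraction(1, 10), "small": Fraction(2, 10), "large": Fraction(7, 10)}
p2 = {"none": Fraction(3, 10), "small": Fraction(6, 10), "large": Fraction(1, 10)}


def marginal(conditionals, weights):
    """Law of total probability: p(y) = sum over entrances e of p(e) * p(y | e)."""
    outcomes = set().union(*(c.keys() for c in conditionals.values()))
    return {
        y: sum(weights[e] * conditionals[e].get(y, Fraction(0)) for e in conditionals)
        for y in outcomes
    }


def total_variation(p, q):
    """Half the L1 distance between two distributions on the same outcomes."""
    outcomes = set(p) | set(q)
    return sum(abs(p.get(y, Fraction(0)) - q.get(y, Fraction(0))) for y in outcomes) / 2


# Experiment 4: entrances are watched, each used with probability 1/2.
half = Fraction(1, 2)
p4 = marginal({"SE": p1, "SW": p2}, {"SE": half, "SW": half})

# Hypothetical outcome of Experiment 3 (both entrances open, nobody counting).
p3 = {"none": Fraction(1, 10), "small": Fraction(3, 10), "large": Fraction(6, 10)}

if __name__ == "__main__":
    print("p4:", {y: str(v) for y, v in sorted(p4.items())})
    print("TV distance between p3 and p4:", total_variation(p3, p4))
```

The mixture identity holds within Experiment 4 by construction. A nonzero distance between p3 and p4 only says that the two experiments were run under different conditions.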

But wait, now we have the same problem as before — p3 does not equal p4! The dealers and users noticed the policemen counting them at the entrance and therefore changed their behavior in the park — conditional probability must be broken!

I'm unconvinced there is anything wrong with conditional probability from this example; what substantive difference is there between your quantum example and this one?

Actually, a lot of surveys, especially those that focus on issues, have this problem. The difficulty is caused by the fact that the respondents often don't know the information contained in the question until they hear the question, so the act of asking the question can alter their opinion on the topic in a "fair" (as in, the survey wasn't trying to alter the opinion) manner. Then, of course, your responses are very likely biased relative to the (more ignorant) population at large.

E.g., "20M people are expected to contract disease A this winter. A comprehensive program of shots is expected to reduce this number by 6M, but cost $3B. Are you for/ neutral/ against/ undecided …"

Hence, well-designed surveys usually try not to convey any information to the respondents at all, to be on the safe side. Naturally this also limits what you can find out.

I stumbled across a paper on the arXiv that seems relevant to Andrew's point.