I think that the Quine-Duhem thesis is closely related to Carnap's refutation of falsification. I do not know the historical sequence, but Quine and Carnap worked closely with each other, and both were aware of the indeterminacy problem very early on.

Quite. No doubt you’ve seen this:

To belabor the point, because experience shows that it is necessary: In our scholastically correct terminology, a probability p is an abstract concept, a quantity that we assign theoretically, for the purpose of representing a state of knowledge, or that we calculate from previously assigned probabilities using the rules (1)–(3) of probability theory. A frequency f is, in situations where it makes sense to speak of repetitions, a factual property of the real world, that we measure or estimate.
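
A toy sketch of the distinction (my own illustration, not from the quoted text; the coin and the numbers are invented): p is assigned from a state of knowledge, f is measured from an actual run of repetitions.

```python
import random

random.seed(1)

# Probability p: assigned theoretically, here from the symmetry of a coin
# about which we know nothing else.
p = 0.5

# Frequency f: a property of an actual run of repetitions, which we measure.
# (The simulation stands in for the real world; its bias happens to match p,
# but the measured f still need not equal the assigned p.)
flips = [random.random() < 0.5 for _ in range(100)]
f = sum(flips) / len(flips)

print(f"assigned probability p = {p}")
print(f"measured frequency  f = {f}")
```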

What is it about probability theory and its interpretation that gives people so much trouble, I wonder. Similarly, many people can’t seem to make – and maintain – the even more important distinction between the (epistemic) probability-space state and the (ontic) configuration- or phase-space state in (applied) non-classical probability.

Sorry, that should be *Carnap*’s point about existential quantifiers.

Yes, Popper’s logical point about existential quantifiers is well taken. However, in my view, the best refutation of Popper’s deductivism is via the Quine-Duhem thesis: that is, in practice all refutation is relative to background/auxiliary assumptions (or “conditional on” background assumptions, to put it in Bayesian terms). But if you allow that refutation is conditional, there’s really no good reason not to also allow conditional confirmation.

A likelihood is conceptually distinct from an “F probability.” Conceptually, the likelihood is merely (proportional to) the probability of the *observed* data, given each possible parameter value. But usually the likelihood will be derived from a model of how a range of *possible* data are generated. Moreover, if you are going to do posterior predictive checking (or a prior predictive analysis), then you *need* a model of the data generating process, and not just the probability of the observed data given each parameter value. And if the model of the data generating process disagrees too much with the distribution of the actual data, then (by the logic of posterior predictive checking and prior predictive analysis), the model should be rejected. So in practice, I think it’s fair to say that “F probabilities” are very important to Bayesians, even if (strictly speaking) an application of Bayes’s formula merely requires an estimate of how probable each parameter makes the observed data.
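
To make the posterior predictive logic concrete, here is a minimal sketch in which the model, data, and test statistic are all my own choices for illustration: a normal(mu, 1) model is fit to data that actually come from a heavier-tailed process, and a tail-sensitive statistic flags the disagreement.

```python
import numpy as np

rng = np.random.default_rng(0)

# The observed data really come from a heavier-tailed process than the model assumes.
y = rng.standard_t(df=2, size=50)
n = len(y)

# Under a normal(mu, 1) likelihood with a flat prior, the posterior for mu
# is normal(mean(y), 1/sqrt(n)).
mu_draws = rng.normal(y.mean(), 1 / np.sqrt(n), size=4000)

# Posterior predictive replications, compared on a tail-sensitive statistic.
T_obs = np.max(np.abs(y))
T_rep = np.array([np.max(np.abs(rng.normal(mu, 1.0, size=n))) for mu in mu_draws])

# A posterior predictive p-value near 0 or 1 signals model-data disagreement.
print(f"posterior predictive p-value for max|y|: {np.mean(T_rep >= T_obs):.3f}")
```

With the heavy-tailed data this p-value will typically be near zero: the check interrogates the model of the data-generating process, not just the probability of the observed data.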

“I never really “got” induction. I think what I do is abduction (guessing), then deduction to work out the consequences of the guess. Then if the guess is good, the predictions should be good.”

To put it simply, induction goes in the other direction: if the predictions are good, then the guess is good.

Peirce thought induction was the inference to the conclusion that the frequency of the trait t in my sample is the same as its frequency in the population. Put another way: I have enough evidence to generalize. I think that is distinct from deductive and abductive reasoning, although it may be that in many areas of the social sciences only abductive inferences are available. However, deductive reasoning itself is never going to be sufficient. There has to be an inductive inference somewhere to justify accepting or rejecting the hypothesis. So the hypothetico-deductive model is not a complete description of scientific inference.
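
As a concrete toy version of that inductive step (the numbers are invented), going from sample frequency to population frequency with the uncertainty made explicit:

```python
from scipy import stats

# 37 of 100 sampled individuals have trait t.
trait_count, sample_size = 37, 100

# Beta posterior for the population frequency under a uniform prior.
posterior = stats.beta(1 + trait_count, 1 + sample_size - trait_count)
lo, hi = posterior.ppf([0.025, 0.975])

print(f"sample frequency: {trait_count / sample_size:.2f}")
print(f"95% interval for the population frequency: ({lo:.2f}, {hi:.2f})")
```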

Yes, Meehl and Lakatos ftw. Actually I am not sure if I learned about Lakatos from Meehl or from this blog. Great paper.

Where does induction fit into this story? Popper has argued (convincingly, in my opinion) that scientific inference is not inductive but deductive, that the way we generalize from particular cases is through the medium of models, and that inference within a model is deductive (see also Greenland 1998).

I never really “got” induction. I think what I do is abduction (guessing), then deduction to work out the consequences of the guess. Then if the guess is good, the predictions should be good. I guess maybe induction is where the guesses are coming from?

Anon:

Lakatos is my hero. I allude to his multiple Poppers in footnote 6 of this article of mine, which I absolutely love.

Check out Imre Lakatos:

Popper0 is the imaginary author of a vulgarised version of Popperian philosophy of science, a phantom created by Ayer, Medawar, Nagel and others. I discuss him only because he is much more widely known than the more sophisticated Popper1 and Popper2.

[…]

Popper0’s position – as Popper constantly stresses – is untenable: ‘no conclusive disproof of a theory can ever be produced’.

[…]

Let us call a series of theories each of which is also acceptable2 – that is, each of which produces ‘facts’ not entailed by its predecessor – an (empirically) progressive shift. But if theories are falsified all the time, they are problematic all the time, and therefore we may speak of progressive problem-shifts. If the problem-shift is not progressive, we call it degenerating. If we put forward a theory to resolve a contradiction between a previous theory and a counterexample in such a way that the new theory, instead of offering a (content-increasing) scientific explanation, only offers a (content-decreasing) linguistic reinterpretation, the contradiction is resolved in a merely semantical, unscientific way. Popper2 forbids the use of such unscientific content-decreasing stratagems. Then Popper’s (Popper2’s) celebrated demarcation criterion can be reformulated as contrasting progressive (scientific) and degenerating (pseudo-scientific) problem-shifts.

Lakatos, Imre. 1968. Criticism and the methodology of scientific research programmes. Proceedings of the Aristotelian Society 69: 149-186. personal.lse.ac.uk/robert49/teaching/ph201/week05_xtra_lakatos.pdf

tl;dr

Theories that make surprising new predictions advance science: i.e., they end up with a high p(theory | data), since all the other terms in the denominator of Bayes’ rule are so low. If you are always making ad hoc adjustments to your theory (the theory lags the observations), then it is pseudoscience. A toy numerical version of this is sketched below.
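
A toy numerical sketch (all numbers invented) of that arithmetic: a prediction is “surprising” when it has low probability under every rival theory, so the denominator of Bayes’ rule is small and the successful theory’s posterior jumps.

```python
# Prior plausibilities and p(data | H) for one bold theory and two rivals.
priors = {"theory": 0.10, "rival_1": 0.45, "rival_2": 0.45}
likelihoods = {"theory": 0.90, "rival_1": 0.01, "rival_2": 0.01}

# Denominator of Bayes' rule: small because the data surprise the rivals.
p_data = sum(priors[h] * likelihoods[h] for h in priors)

posterior = priors["theory"] * likelihoods["theory"] / p_data
print(f"p(data) = {p_data:.3f}")             # 0.099
print(f"p(theory | data) = {posterior:.3f}")  # ~0.909, up from 0.10
```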

What I mean is that nothing about what seems plausible to you can possibly affect how the world actually behaves (the mind projection fallacy). So if you set up a model and say that P(data | model, parameters) = normal(x | 0, 1), nothing about this claim, or how often you are willing to assert it, will affect the physical process, which may give the data some totally different frequency distribution that you don’t know about because you’re not collecting tens of thousands of data points over multiple years in all possible weather conditions, etc.
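
A small simulation of that gap, with an invented “physical process” standing in for the world: declaring a normal(0, 1) model does nothing to data generated by something else, and a simple goodness-of-fit test measures the mismatch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)

# The world: a skewed process (invented for illustration), not normal(0, 1).
data = rng.lognormal(mean=0.0, sigma=1.0, size=10_000) - 1.0

# The declared belief: data ~ normal(0, 1). A Kolmogorov-Smirnov test
# quantifies how far the actual frequency distribution is from that belief.
ks = stats.kstest(data, "norm")
print(f"KS statistic = {ks.statistic:.3f}, p-value = {ks.pvalue:.2e}")
```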

“In using Bayes’ rule one does not put a frequency-probability into a belief-probability context, but rather a belief-probability that is induced by a frequency-probability.”

This is a good way of putting it. To elaborate (or perhaps rephrase): the prior needs to be based on whatever credible prior knowledge one has; if credible frequency-probability information is available, it is reasonable to include it in the prior belief-probability (in fact, excluding it would be unreasonable).

In all other cases the plausibility remains different from frequency, usually in an unknown way…

I think you only need infinite exchangeability to get a strong law of large numbers – that is to say, the theorem that the expected long-run frequency is numerically equal to the plausibility.
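
A small de Finetti-style simulation of that claim, with a Beta(2, 2) mixing distribution chosen purely for illustration: individual long-run frequencies vary from sequence to sequence, but their expectation equals the plausibility P(X_1 = 1).

```python
import numpy as np

rng = np.random.default_rng(7)
n_sequences, n_flips = 5_000, 2_000

# Exchangeable 0/1 sequences, generated by mixing Bernoulli(theta) over a prior.
thetas = rng.beta(2, 2, size=n_sequences)
freqs = rng.binomial(n_flips, thetas) / n_flips  # each ~= its sequence's limit

print(f"plausibility P(X_1 = 1) = E[theta] = {thetas.mean():.3f}")  # ~0.5
print(f"expected long-run frequency       = {freqs.mean():.3f}")    # ~0.5
print(f"spread of individual frequencies  = {freqs.std():.3f}")     # not zero
```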

@Andrew, there’s lots of food for thought in the linked texts, but I don’t think they address the question posed by Ed Bein.

He correctly points out that probability theory is a set of properties that may apply to different things, in this case “long-term frequency” and “degree of belief/knowledge/plausibility”, and the fact that these properties apply to both of them doesn’t justify plugging these different things into the same formula (Bayes’ rule).

Daniel Lakeland’s reply addresses this concern. In using Bayes’ rule one does not put a frequency-probability into a belief-probability context, but rather a belief-probability that is induced by a frequency-probability. In the rare case in which we can reasonably know the frequency-probability, this leads to degrees of belief that a realization of the random process has a given outcome. I think it would be great if someone could work this out in detail, especially for cases where “reasonably knowing the frequency-probability” is not so simple.
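
As a small step toward working it out, here is a sketch (with invented numbers) of a belief-probability induced by a frequency-probability: hold a belief distribution over the unknown physical frequency, and take its expectation as the belief about the next outcome.

```python
import numpy as np

rng = np.random.default_rng(3)

# Belief over the unknown frequency-probability of heads, e.g. after
# 100 calibration flips with 62 heads: Beta(1 + 62, 1 + 38).
freq_belief = rng.beta(63, 39, size=100_000)

# The induced belief-probability that the next flip is heads is the
# expectation of the belief over frequencies.
print(f"belief that the next flip is heads: {freq_belief.mean():.3f}")  # ~0.618
```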

Andrew,

I consider you one of the most intellectually sharp and interesting thinkers on this planet. I meant ‘careful reading’ as in paying full attention to definitions & interpretations. You know researchers come in all varieties and competencies. And even the best of readers can be distracted and miss things.

I have followed a considerable amount of research in three areas: nutrition, kinesiology, & hormones. So maybe I had some background to evaluate Cuddy’s thesis for example. But I don’t pretend that I am right in my interpretations either. I am relieved when someone dissuades me from an inaccurate position I hold.

If you notice, I was referring to informal discussions as well, where I think researchers’ biases are more readily discernible.

It may also be that, as a non-statistician, I am forced to rely on verbal explanations, which someone like Taleb characterizes as inadequate. But he tends to make a lot of verbal evaluations too.

Sameera:

Sure, Wansink’s reviewers “weren’t sharp enough.” But . . .

1. Until a few years ago, I “wasn’t sharp enough” either, in the sense that I don’t think I would’ve caught those problems.

2. Nutrition and behavior research is important. It affects policy! If the reviewers in that field are, like me, not so sharp, that’s something we have to deal with. So I think the advice to read more carefully can be part of the solution, but another part has to be giving people some sense of how to read, what to look for, what evaluations make sense, etc.

We have to move beyond the existing approach of check boxes for:

(a) statistical significance,

(b) causal identification,

(c) novelty,

(d) the wow factor.

Think about it: Wansink did well in all four of those dimensions! He always had statistical significance (or so it seemed), he had randomized experiments and thus causal identification for free (or so it seemed), and of course he had novelty and the wow factor. Sure, he got some of this by cheating, but, as we’ve discussed on the blog, had he put a bit more effort into each project, he could have gotten all these results without cheating, just using standard forking paths and storytelling of the sort that is *still* considered acceptable in scientific publications.

Sorry, some % has…

Eeeekkk.

We learned many of the same lessons about research methods & tests from the Evidence-Based Movement during the 90s, which, btw, have resurfaced in the last decade or more. So some of us are not so ill-equipped to think critically about the literature. I do not have the statistical competencies you possess. However, when I talk to statisticians informally, I can discern that they stray from their stated assumptions.

As for Brian Wansink’s reviewers, I would guess they weren’t sharp enough. I’m suggesting that if you countenance rigor as integral, then ‘careful reading’ is entailed. I guess what I’m suggesting is that the qualitative side seems to have to take a back seat to the quantitative side. Rex Kline conducted a survey of academics’ knowledge of statistics. So there are some indicators that some % have been winging it in their teaching.

Sameera:

Careful reading helps, but you need to know what to look for. We’ve learned so much in the past decade about what to look for. I’m much more aware of researcher degrees of freedom and forking paths, and of the general point that just cos something’s published in a respected journal, it doesn’t have to be any good. There are lots of papers where it’s now clear that there are problems, but ten years ago I and others would’ve just taken their claims at face value. Just remember: all of Brian Wansink’s published papers were read carefully by at least three reviewers . . . who didn’t notice any of the in-retrospect-obvious problems.

Replication is a means of revisiting the state of theory & practice in these fields. And we are benefiting already. Some small subset may be able to offer up new, more substantive insights in the process.

Lastly, all of us can gain if we can convey our viewpoints well. Convoluted writing pervades journals.
