Often you can get clarity on a philosophical question by cashing it out as a mathematical one. We can ask: since Bayes is often presented as an account of subjective or personal probability, what does it have to say about the kinds of scenarios in which frequentist techniques have had the most success, i.e., repeatable trials that are notionally random? The Bayesian perspective on such scenarios is that nothing distinguishes any one “random” trial from any other, so trials are exchangeable. If they are infinitely exchangeable (i.e., the number of exchangeable trials can grow without bound) it follows as a mathematical theorem that a Bayesian agent expects stable relative frequencies, and that the expected outcome relative frequencies are equal to the predictive probabilities of outcomes on *the next* trial. If it wasn’t already obvious that relative frequencies satisfy the usual probability axioms, we’d be forced to recognize the fact at that point anyway.
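A quick numerical check of the claim above can be sketched as follows. It assumes, purely for illustration, binary outcomes and a Beta(2, 3) mixing distribution; neither is from the comment, and the function names are hypothetical. Drawing a chance `theta` once per sequence and then generating all trials from it yields an exchangeable sequence. The simulation compares the success probability on a single trial with the average long-run relative frequency.

```python
import random

def exchangeable_sequence(n_trials, a, b, rng):
    """Draw theta ~ Beta(a, b) once, then n_trials Bernoulli(theta) outcomes."""
    theta = rng.betavariate(a, b)
    return [1 if rng.random() < theta else 0 for _ in range(n_trials)]

def simulate(n_sequences=2000, n_trials=500, a=2.0, b=3.0, seed=0):
    """Return (estimated P(success on one trial), mean relative frequency)."""
    rng = random.Random(seed)
    first_trial_hits = 0
    freq_total = 0.0
    for _ in range(n_sequences):
        seq = exchangeable_sequence(n_trials, a, b, rng)
        # By exchangeability any position would do; the first is used here.
        first_trial_hits += seq[0]
        freq_total += sum(seq) / n_trials
    return first_trial_hits / n_sequences, freq_total / n_sequences

predictive, expected_freq = simulate()
exact = 2.0 / (2.0 + 3.0)  # prior mean a / (a + b) of the assumed Beta(2, 3)
print(predictive, expected_freq, exact)
```

Both estimates should land near the prior mean of the assumed Beta distribution, up to simulation noise. Within any one sequence the relative frequency settles near that sequence's own `theta`, which is the "stable relative frequency" part of the statement.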

*application*. A more restrictive approach can actually be “more correct” as it may be more closely aligned with what you want.

Amy argues Peirce (note correct spelling) is pluralistic not me!

http://muse.jhu.edu/journals/csp/summary/v045/45.3.mclaughlin.html

I would agree, but given that encourages inducement of a cult – I will decline!

Hey, in the long run we are all run – in the short run we can pretend not to know that.

In the 8 schools example, the effects from schools are a physical ensemble so those can be considered as random but not the hyper-parameter!

re ‘taboo to using any random model for anything other than physical random mechanisms’

– not even if interpreted as an ensemble of deterministic models?

See comment by David Cox on the Lindley and Smith paper – Bayes estimates for the linear model. 1972.

The argument seems to be that the “common physical random mechanism” is what _needs_ to be modeled, and that this can be done without a prior (albeit accepting poor repeated-sampling properties when data are sparse, until better higher-order asymptotics are somehow developed). So drop the prior.

There does seem to be some taboo to using any random model for anything other than physical random mechanisms for any purpose (even non-literally) …

One of Andrew’s and my colleagues suggested this is due to over-deference to RA Fisher’s dismissal of any one method of induction. That is hard to disagree with – induction cannot be deductive, and Lindley’s recollection of the sacrosanct LP providing such an axiom-based (deductive) inference probably painted Bayesianism as largely being induction made deduction.

There is also Fisher’s dismissal of work by Cochran (1937) and Yates (1938) on the analysis of repeated experiments using the Normal-Normal random effects model and variations on it.

But weakening the axioms or adding others can lead to different, not-so-crazy calculi. See e.g. Norton, J.D. (2007). Probability Disassembled. Br J Philos Sci 58, 141–171. (Available here.) The paper suggests an analogy with Euclidean geometry:

The fragility of these demonstrations is very similar to the failure of attempts to show that Euclid’s fifth postulate of the parallels is the only postulate admissible in geometry. These attempts started by denying Euclid’s fifth postulate in the context of the other postulates, and inferring from the denial some unusual geometric propositions that, we were to suppose, are incoherent. It was eventually realized in the nineteenth century that the denial of Euclid’s fifth postulate involved no inconsistency; it merely led us to different geometries.

And, as above, that’s why I find starting from probability calculations and using exchangeability as a bridge to Bayes far more convincing than most typical presentations. It is also just a nice way of presenting probability applications, full stop.

I’m curious [without wanting to spark another tiresome debate – external links rather than a long comment thread are fine ;-)] – are there any frequentist criticisms that directly address this approach? I have seen frequentists invoke exchangeability and then drop the priors…

Some men induce a cult following: e.g. Jaynes, Judea Pearl, I J Good, C S Pierce.

I think of probability-as-calculus-of-plausible-inference as derived from an analogy with finite populations (or, say, finite areas, as on a dart board), a la Bernoulli: “probability is degree of certainty and differs from absolute certainty as the part differs from the whole”.

http://opinionator.blogs.nytimes.com/2010/04/25/chances-are/

Of course, it’s also nice that the (assumed) requirements of many applications lead to the same mathematical model. But I like the idea of moving back and forward between the physical or application concept and the corresponding mathematical model, while keeping these separate.

http://www.tandfonline.com/doi/full/10.1080/00031305.2014.951127#abstract

I’m not sure about the ‘grain of salt’ sentence in the abstract, but the tables in the article are clearly laid out.

“This literature has amply demonstrated that people actually can readily and accurately reason in Bayesian terms if the data are presented in frequency form, but have difficulty if the data are given as percentages or probabilities.”

I suspect that part (but not all) of the problem is that many people have problems with percentages and proportions, hence also with probabilities. Starting with frequencies can help them get over this hurdle.

2. Andrew said:

“Probability is a mathematical model with many different applications, including frequencies, prediction, betting, etc. There’s no reason to think of any one of these applications as uniquely fundamental.”

See http://www.ma.utexas.edu/users/mks/statmistakes/probability.html for an example of (one aspect of) how I usually handle the multi-faceted aspect of probability in teaching (undergraduate and graduate).

Since the students were, except for membership in the honors college, from the general population, there could not be an assumption about mathematical ability beyond what being in the honors college implied, nor an assumption about major. In fact, most of the students were not majoring in STEM subjects. I had a significant number of pre-med and pre-law students, even a dance major once. Of course, there were some math/physics types as well (and three of them out of about 150 students that took it during those 9 offerings actually became professional statisticians, though that wasn’t the goal of the course).

The seminar used only finite state spaces (so no calculus) and was organized primarily as a course in Bayesian decision theory. I taught the probability using Gigerenzer’s ideas and his book, “Calculated Risks,” which I highly recommend for a context like this. I also usually used Hammond, Keeney and Raiffa’s “Smart Choices” and sometimes recommended other books.

I really think that Gigerenzer’s ideas about natural frequencies as an approach to teaching probability ideas work well and I highly recommend them. Incidentally, I have often run across folks who, when they learn that I’m a statistician, tell me how much they hated their statistics course. I then tell them that that’s not the kind of statistics I teach, and give them a simple example (usually false positives of mammograms since it’s an easy example and can be done without even writing anything down), to which the response is something like “Wow, that’s so simple. I wish my course had been taught like that!”
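For readers who have not seen the mammogram example, here is a minimal sketch of the natural-frequency bookkeeping. The numbers (1% prevalence, 90% sensitivity, 9% false-positive rate, 1000 women) are illustrative assumptions, not figures from the course or from any particular study, and the function name is hypothetical.

```python
def natural_frequencies(population, prevalence, sensitivity, false_positive_rate):
    """Turn rates into whole-number counts, the way one would on a whiteboard."""
    sick = round(population * prevalence)
    healthy = population - sick
    true_positives = round(sick * sensitivity)
    false_positives = round(healthy * false_positive_rate)
    all_positives = true_positives + false_positives
    return {
        "sick": sick,
        "healthy": healthy,
        "true_positives": true_positives,
        "false_positives": false_positives,
        # Share of positive tests that belong to women who actually are sick.
        "ppv": true_positives / all_positives,
    }

# Assumed, illustrative inputs (not real screening data).
counts = natural_frequencies(1000, 0.01, 0.90, 0.09)
print(counts)
```

In words: of 1000 women, 10 have the disease and 9 of them test positive; of the 990 who do not, about 89 also test positive. So only 9 of the 98 positive tests come from women who are sick, which is the same answer Bayes' rule gives (up to rounding) without ever writing down a conditional probability.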

The link to the course webpage the last time it was taught is here:

http://bayesrules.net/hcol196.html

There’s a link there to my course blog, which includes shots of the whiteboards taken by my iPhone. (There are other courses there as well; this offering was taught in the Spring of 2011, which should make it easy to find the relevant entries.)

This sarcasm is unhelpful. I refer you to chapter 1 of BDA where we have several different examples of empirically determined probabilities.

All those people who see probabilities as a calculus of plausible inference in the face of uncertainty just aren’t smart enough to see how inherently flawed this is.

Probability as a model of observable relative frequencies can’t get you to probability as a calculus of plausible inference. Probability as a calculus of plausible inference gives you relative frequencies via de Finetti’s exchangeability theorem.
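For reference, the theorem in its simplest (binary) form can be stated as below. The notation is my own choice: $\mu$ is the mixing distribution and $\Theta$ the limiting relative frequency.

```latex
% de Finetti's representation theorem, binary case.
% Assumption: X_1, X_2, \dots is an infinitely exchangeable sequence with
% x_i \in \{0, 1\} and s_n = \sum_{i=1}^{n} x_i.
\[
  P(X_1 = x_1, \dots, X_n = x_n)
    = \int_0^1 \theta^{s_n} (1 - \theta)^{\,n - s_n} \, d\mu(\theta)
  \qquad \text{for every } n,
\]
% for a unique probability measure \mu on [0, 1]. Moreover the relative
% frequency converges almost surely, and its limit has distribution \mu:
\[
  \frac{1}{n} \sum_{i=1}^{n} X_i \;\longrightarrow\; \Theta \quad \text{a.s.},
  \qquad \Theta \sim \mu .
\]
% The predictive probability of the next outcome is the expected limiting
% relative frequency given the data so far:
\[
  P(X_{n+1} = 1 \mid x_1, \dots, x_n) = E[\,\Theta \mid x_1, \dots, x_n\,].
\]
```

So the agent starts only with a judgment of exchangeability about plausible outcomes, and the existence of a stable relative frequency comes out as a consequence rather than going in as an assumption.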

There is animation and R code to generate better animations here https://phaneron0.wordpress.com/2012/11/23/two-stage-quincunx-2/

What I have learned from some webinars and first-hand tutoring is that there appear to be some conceptual challenges. These seem to stem from the need to grasp the abstract modeling required to represent empirical reality. For instance, in the animations there is a machine that represents how nature generated the observation and then a second machine to represent how an analyst would represent that and then work with their representation to, say, get an interval for an unknown. Some statisticians seem to get the need for two machines right away but others seem not to.

Given that what I am doing is just inefficient simulation from the posterior, most problems could be done this way until they get too complex (e.g. for hierarchical models just a subset of 5 groups rather than all 30).
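To make the "two machines" idea concrete, here is a minimal Python sketch of that kind of inefficient posterior simulation. It is not the R code from the linked post: it swaps in a binomial stand-in with a Uniform(0, 1) prior, and the observed data (7 successes in 10 trials) and the function names are made up for illustration.

```python
import random

def nature_machine(theta, n_trials, rng):
    """Machine 1: how nature turns an unknown theta into a success count."""
    return sum(1 for _ in range(n_trials) if rng.random() < theta)

def two_stage_posterior(observed, n_trials, n_draws, rng):
    """Machine 2: the analyst's representation, run many times.

    Draw theta from the prior, push it through a copy of the nature
    machine, and keep theta only when the fake data match the observed data.
    """
    kept = []
    for _ in range(n_draws):
        theta = rng.random()  # assumed Uniform(0, 1) prior
        fake = nature_machine(theta, n_trials, rng)
        if fake == observed:
            kept.append(theta)
    return kept

rng = random.Random(1)
# Hypothetical observation: 7 successes in 10 trials.
draws = sorted(two_stage_posterior(observed=7, n_trials=10, n_draws=50000, rng=rng))
post_mean = sum(draws) / len(draws)
lower = draws[int(0.025 * len(draws))]   # rough 95% interval from the kept draws
upper = draws[int(0.975 * len(draws))]
print(len(draws), round(post_mean, 3), round(lower, 3), round(upper, 3))
```

The kept values of `theta` are draws from the posterior, so an interval for the unknown is just a pair of quantiles of what survived. Most draws are thrown away, and the acceptance rate drops quickly as the data get richer, which is why this only works until problems get too complex.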

Any of these models is general enough to include all the others: you can start with relative frequencies and derive prediction, betting, etc.; you can start with prediction and from that get relative frequencies, betting, etc.; you can start with betting; etc.

But the different models are more or less applicable depending on context. Relative frequency doesn’t make so much sense when absolute frequencies are low. Prediction doesn’t make much sense if you’re not actually making predictions. Betting doesn’t make sense if you consider selection of what bets you’re offered. But that’s ok. Probability is a mathematical model. Mathematical models aren’t perfect.

How robust is the underlying empirical work, anyone know? After all the iffy Psych studies Andrew covers on the blog on a daily basis, I’m very leery of results that emanate from Psych Depts.

If one of the applications was general enough to include many or all of the others as special cases, that would be a reason to think that it’s (more) fundamental, no?
