Occam

Regarding my anti-Occam stance (“I don’t count ‘Occam’s Razor,’ or ‘Ockham’s Razor,’ or whatever, as a justification. You gotta do better than digging up a 700-year-old quote.”), David Gillman writes:

I was at your talk at MIT yesterday, and something bothered me until I realized just now that your reason for rejecting Occam’s Razor was wrong, from a Bayesian point of view. A priori what’s the probability that something somebody says will be remembered for 800 years? I figure it’s machine learning people who want your models to be simple, but Occam’s answer to that would be that you aren’t a machine.

He also says,

If somebody quotes ancient wisdom and you disagree with them, Occam’s Razor says don’t blame the ancient wisdom, because the person is probably misappropriating it.

Good point.

8 thoughts on “Occam”

  1. Parsimony isn't just a matter of simplicity, is it? I would view AIC as a criterion for selecting parsimonious models, yet it does not always select the simplest one; it balances likelihood against model complexity. If there are two models, one with a single parameter and one with 10 parameters, the preference depends on study objectives as much as on any statistical consideration. First, if covariates cost me money, I'll take the single-parameter model if it gets the job done (e.g., helps me make a good decision). Second, even if the data is free, I may still choose the simpler model if I ultimately have to explain it to, say, a politician. Sometimes dimension reduction is inherently valuable.
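    A rough illustration of that AIC bookkeeping, with a made-up dataset and a small helper (gaussian_aic) assuming ordinary least squares with Gaussian errors: AIC = 2k - 2*(maximized log-likelihood), so every extra parameter has to buy at least one unit of log-likelihood to pay for itself.

    ```python
    # Sketch: compare a 1-covariate and a 10-covariate linear model by AIC,
    # assuming Gaussian errors. Only the first covariate actually matters.
    import numpy as np

    rng = np.random.default_rng(0)
    n = 200
    X = rng.normal(size=(n, 10))
    y = 0.8 * X[:, 0] + rng.normal(size=n)

    def gaussian_aic(X, y):
        """AIC = 2k - 2*loglik for an OLS fit; k counts the slopes plus sigma^2."""
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        resid = y - X @ beta
        sigma2 = resid @ resid / len(y)  # MLE of the error variance
        loglik = -0.5 * len(y) * (np.log(2 * np.pi * sigma2) + 1)
        return 2 * (X.shape[1] + 1) - 2 * loglik

    print("AIC, 1-covariate model :", gaussian_aic(X[:, :1], y))
    print("AIC, 10-covariate model:", gaussian_aic(X, y))
    # The smaller model usually wins here: the nine extra slopes buy almost no
    # likelihood but cost 18 points of complexity penalty.
    ```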

  2. "If somebody quotes ancient wisdom and you disagree with them, Occam's Razor says don't blame the ancient wisdom, because the person is probably misappropriating it."

    "Good point."

    I disagree that this is a good point. How is it simpler (or more parsimonious) to posit that 'ancient wisdom' has been misappropriated than to posit that it's wrong?

    The typical paraphrase of Occam's razor captures a key point that is obscure in (or missing from) the original, namely that 'all other things being equal' is a necessary precondition for a consideration of simplicity.

  3. "I disagree that this is a good point. How is it simpler (or more parsimonious) to posit that 'ancient wisdom' has been misappropriated than to posit that it's wrong?"

    I believe this goes to another famous razor: "Never attribute to malice that which can be attributed to stupidity", or its funny cousin, "cock-up before conspiracy". I'm far more willing to consider one person's misinterpretation than those of thousands before them.

  4. It sounds like they're confusing Occam's Razor with the sort of simplification Galileo did when working with falling bodies. IIRC, Cushing, in "Philosophical Concepts in Physics," discussed how, if Brahe had had better astronomical equipment, Kepler wouldn't have been able to formulate his three planetary laws. Clearly a case where more parameters would add to the model but would have put the problem beyond the state of the art.

    Occam's Razor is irrelevant to that sort of issue. Occam's Razor has to do with parameters that add no explanatory power or bring no new information. "God doesn't want his worshipers flattened" isn't going to improve the engineering of a cathedral, but comparing the compressive strengths of different types of granite probably will. It sounds like the machine learning people want to assume that hornblende-granite is essentially the same as tourmaline-granite when considering the cross-section of a flying buttress, while you think the difference is worth exploring.

  5. In your previous post, you said "I've never seen any good general justification for parsimony".

    Here's one: Bentler, P.M. and Mooijaart, A. (1989). Choice of structural model via parsimony: A rationale based on precision. Psychological Bulletin, 106(2):315–317.

    The abstract says (in its entirety, it's short):

    "It is shown that, in large samples, the more parsimonious of two competing nested models yields an estimator of the common parameters that has smaller sampling variance. The use of parsimony as a criterion for choice between two otherwise acceptable models can thus be rationalized on the basis of precision of estimation."

    The paper is talking about structural equation models, but hey! they're just multilevel models thought about differently. (Possibly.)
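
    To make the abstract's claim concrete, here is a rough Monte Carlo sketch with a made-up regression setup: two nested OLS models share the slope on x1, the larger model adds a correlated but irrelevant predictor x2, and the smaller model recovers the shared slope with noticeably less sampling variability.

    ```python
    # Sketch: sampling variance of the shared slope under two nested OLS models,
    # when the extra predictor's true coefficient is zero.
    import numpy as np

    rng = np.random.default_rng(1)
    n, reps = 100, 5000
    b1_small, b1_big = [], []

    for _ in range(reps):
        x1 = rng.normal(size=n)
        x2 = 0.7 * x1 + rng.normal(size=n)   # correlated with x1, irrelevant to y
        y = 0.5 * x1 + rng.normal(size=n)    # true model uses x1 only

        b1_small.append(np.linalg.lstsq(x1[:, None], y, rcond=None)[0][0])
        b1_big.append(np.linalg.lstsq(np.column_stack([x1, x2]), y, rcond=None)[0][0])

    print("sd of slope estimate, 1-predictor model:", np.std(b1_small))
    print("sd of slope estimate, 2-predictor model:", np.std(b1_big))
    # The larger model pays for its unnecessary parameter with a noisier
    # estimate of the slope that both models share.
    ```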

  6. The principle of Occam's Razor, when used correctly, says that given equal explanatory power, the simpler explanation tends to be the correct one. So if the simpler theory doesn't explain what you are looking at as well, this is no time to invoke Occam's Razor. However, if you have competing theories that both offer good explanations of the phenomenon, the principle of Occam's Razor can lead one toward the simpler one.

  7. Jeremy,

    This might be true if the models are estimated using least squares or something similar, but not if they are estimated using prior information.

    Anonymous,

    I completely disagree. As Radford wrote:

    Sometimes a simple model will outperform a more complex model . . . Nevertheless, I [Radford] believe that deliberately limiting the complexity of the model is not fruitful when the problem is evidently complex. Instead, if a simple model is found that outperforms some particular complex model, the appropriate response is to define a different complex model that captures whatever aspect of the problem led to the simple model performing well.
