## John Cook: “Students are disturbed when they find out that Newtonian mechanics ‘only’ works over a couple dozen orders of magnitude. They’d really freak out if they realized how few theories work well when applied over two orders of magnitude.”

Following up on our post from this morning about scale-free parameterization of statistical models, Cook writes:

The scale issue is important. I know you’ve written about that before, that models are implicitly fit to data over some range, and extrapolation beyond that range is perilous. The world is only locally linear, at best.

Students are disturbed when they find out that Newtonian mechanics “only” works over a couple dozen orders of magnitude. They’d really freak out if they realized how few theories work well when applied over two orders of magnitude.

It’s kind of a problem with mathematical notation. It’s easy to write the equation “y = a + bx + error,” which implies “y is approximately equal to a + bx for all possible values of x.” It’s not so easy using our standard mathematical notation to write, “y = a + bx + error for x in the range (A, B).”
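Here is a small sketch of what goes wrong when the "for x in the range (A, B)" part gets dropped. Everything in it is an assumption made for illustration: the saturating `truth` function, the fitting range (0, 0.1), and the test points are all hypothetical. The line is fit by ordinary least squares on the narrow range and then evaluated one and two orders of magnitude outside it.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def truth(x):
    # Hypothetical "real world": nearly linear for small x, saturating for large x.
    return x / (1.0 + x)


# Fit only on the range (A, B) = (0, 0.1), where the world looks linear.
A, B = 0.0, 0.1
xs = [A + (B - A) * i / 20 for i in range(21)]
a, b = fit_line(xs, [truth(x) for x in xs])


def abs_error(x):
    """Absolute gap between the fitted line and the hypothetical truth at x."""
    return abs((a + b * x) - truth(x))


# One point inside the fitted range, then 10x and 100x beyond its upper end.
for x in (0.05, 1.0, 10.0):
    print(f"x = {x:5.2f}   |line - truth| = {abs_error(x):.4f}")
```

The fitted coefficients are the same in all three rows; what changes is only whether x is still inside the range the line was fit to.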

The more general issue is that it takes fewer bits to make a big claim than to make a small claim. It’s easier to say “He never lies” than to say “He lies 5% of the time.” Generics are mathematically simpler and easier to handle, even though they represent stronger statements. That’s kind of a paradox.

1. John Cook says:

In analysis you can say things like sin(x) = x + O(x^3). There you have your linear approximation and your error disclaimer all in one tidy package.
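A quick numerical check of that tidy package, as a sketch rather than a proof: if sin(x) = x + O(x^3), then |sin(x) - x| / |x|^3 should stay bounded as x shrinks. The helper name `cubic_ratio` and the sample points are my own choices. From the Taylor series the ratio should approach 1/6 from below.

```python
import math


def cubic_ratio(x):
    """|sin(x) - x| / |x|**3, which the O(x^3) claim says stays bounded near 0."""
    return abs(math.sin(x) - x) / abs(x) ** 3


# The ratio should settle near 1/6 as x gets small.
for x in (1.0, 0.1, 0.01):
    print(f"x = {x:<5}  ratio = {cubic_ratio(x):.7f}   (1/6 = {1 / 6:.7f})")
```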

Of course it’s much harder to do anything like this in statistics. You don’t know over what range a model holds. Even if you know the range over which it was fitted, estimating how rapidly it might break down outside that range is highly context-dependent. Maybe you have a good idea or maybe you don’t have a clue. It’s not anything near as simple as reporting asymptotic order.

2. jim says:

Of course in physics the units of “order of magnitude” are distances, masses and derivatives (pressure, momentum, force bla bla).

So if we’re assessing the order of magnitude range of effectiveness of, say, nudge interventions, what’s the unit that scales over orders of magnitude? What’s the relevant unit in Carol Dweck’s 30 minute intelligence-can-be-improved educational intervention that would be subject to order of magnitude range? Speaker volume? Length of intervention? Number of words? Pixels per frame?

3. morris39 says:

Newton’s laws of motion are amazing in their simplicity and accuracy (large/slow objects in our planetary energy field) but not good for small and very fast objects (e.g. photons). QM works very well for the very small objects but not for the large and slow. It shows no imminent promise (100 years) although it is asserted without doubt that it is correct for all things in all circumstances.
Scale (energy density) seems to be a relevant factor for gravity. Newton’s works all the way out to Mercury. Einstein’s works everywhere in our planetary system but fails in distant galaxies. Solitary photons misbehave when up against test slits but are very orderly when in their usual trillions (atomic clocks say).
So what? On their face these sorts of observations are not obviously wrong, maybe irrelevant, but will not get any response.
Consensus is important.

• Andrew says:

Morris:

The point is that when people see a formula of the form y = f(x), they by default often assume it holds for all values of x. That’s how we write things in mathematics. But for real-world formulas, in social science as well as physics, there’s no reason to assume that. It can take a while for people to realize this.

• David Marcus says:

You can use standard function notation, f: D -> R, where D is the domain and R is the codomain, rather than introducing x and y.
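In code, the f: D -> R idea amounts to making the domain part of the model object instead of leaving it in the reader's head. A minimal sketch, where the class name `BoundedLinearModel`, its fields, and the example numbers are all hypothetical:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundedLinearModel:
    """y = a + b*x, claimed only for lo <= x <= hi."""

    a: float
    b: float
    lo: float
    hi: float

    def __call__(self, x: float) -> float:
        # Refuse to extrapolate rather than silently returning a number.
        if not (self.lo <= x <= self.hi):
            raise ValueError(
                f"x={x} is outside the fitted range [{self.lo}, {self.hi}]"
            )
        return self.a + self.b * x


# Hypothetical coefficients and range, for illustration only.
model = BoundedLinearModel(a=1.0, b=2.0, lo=0.0, hi=10.0)
print(model(3.0))

try:
    model(50.0)
except ValueError as exc:
    print("refused:", exc)
```

Raising an error is only one possible design. A softer version could return the prediction along with a warning, which may be closer to how extrapolation is handled in practice.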

• Anonymous says:

Assuredly. Applicability of the solution is constrained by recognizing the boundary and starting conditions of the equation.
Maybe not surprising that similar constraints apply in ordinary situations (embarrassing obviousness). Why is it so tempting then to claim otherwise?

• Antonio says:

It seems to me like “there are many data/phenomena that follow the normal distribution, but not in the tails”. This holds only just within 2 or 3 sigma. Over that, it is no longer true…
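To put rough numbers on this, here is a sketch comparing two-sided tail probabilities of a standard normal with those of a heavier-tailed alternative. The Laplace distribution, scaled to variance 1, is purely my choice of stand-in for "data that look normal in the middle but not in the tails"; nothing in the comment singles it out.

```python
import math


def normal_two_sided_tail(k):
    """P(|Z| > k) for a standard normal Z."""
    return math.erfc(k / math.sqrt(2.0))


def laplace_two_sided_tail(k):
    """P(|X| > k) for a Laplace variable with variance 1 (scale b = 1/sqrt(2))."""
    return math.exp(-math.sqrt(2.0) * k)


def tail_ratio(k):
    """How many times more tail mass the Laplace has than the normal beyond k."""
    return laplace_two_sided_tail(k) / normal_two_sided_tail(k)


for k in (2, 3, 5):
    print(
        f"k = {k}: normal {normal_two_sided_tail(k):.3e}, "
        f"laplace {laplace_two_sided_tail(k):.3e}, ratio {tail_ratio(k):.1f}"
    )
```

Under these assumptions the two tail probabilities are within a small factor of each other around 2 sigma and differ by about three orders of magnitude at 5 sigma.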

• I am not Newton says:

How about this… I’ve been thinking about Newton’s first law, and perhaps someone who knows better can educate me… but doesn’t it describe a situation which is impossible? I mean there will never be a situation in which no external force acts. As a matter of fact, I don’t even know what that would mean for objects that are constituted by a swarm of smaller objects, like atoms, or waves or whatever. So in what sense is the 1st law true? It can’t be; if there were no external forces, there would be no existence and no point in thinking about motion. But it works… we can work with that… it’s like the universe longs for non-existence…

• Dzhaughn says:

It’s not true, it’s just useful.

• When dropped from a height of ~ 1 m in air, any object between a BB ~ 0.1 g and a chunk of concrete ~ 10000000 g will accelerate according to dv/dt = 9.8 m/s^2 to within a negligible error. That’s 8 orders of magnitude of mass. If you suck the air out of a special test chamber you can get it to work for feathers and sheets of paper and such: https://www.youtube.com/watch?v=E43-CfukEgs

That’s incredibly impressive compared to say predicting the effect of cancer drugs on longevity.

• I am not Newton says:

That is true.

However, in predicting cancer people usually understand that we deal with uncertainty, whereas Newton’s laws are usually presented as something that is absolutely true. Not as an approximation; quite the contrary: the noisy world we live in is presented as an approximation of those laws, which are seen as something more fundamental.

Maybe readers of this blog are too sophisticated to be fooled in such a way, but I think the topic of this post does reinforce the idea that – say – Newton’s laws are usually conceptualized as absolute truths, and people might be shocked to find that they are not. This is paradoxical in a sense, since many people know that the theory of relativity is better – in that it is more accurate – but still, I think, they do not apply this to the Laws (Newton’s) themselves. They might think that relativity is a better approximation of the world that is ultimately governed by the Laws.

• If I move with a constant velocity v for some amount of time t, I travel a distance vt. You could reply that this “law” that distance is velocity x time is not “true” because “there will never be such a situation” in which velocity is constant; in reality we are constantly buffeted by “a swarm of smaller objects, like atoms, or waves or whatever” that make our velocity fluctuate. But this would be a silly objection. The point of distilling the motion into distance = velocity x time is that it captures the essence of what’s happening, and is a framework on which to add further complexity. No one is shocked that mindlessly applying it can lead to inaccuracy.

Similarly with Newton’s laws. Only the worst physics class would treat them as God-given laws to be mindlessly applied such that one is “shocked” that real objects have multiple forces acting on them. Rather, the brilliance of Newton’s laws is that this complexity can be whittled away — experimentally or theoretically — giving a framework that is very simple, and onto which complexity can sensibly and naturally be added.

• I am not Newton says:

I obviously agree with your general point, and my point is not that we should clutter well-working models with all sorts of (unnecessary) stuff; ships sailed to the moon by applying the (relatively) simple formulae, no need to adjust for the position of Pollux in the distance. This is well understood.

My point is more philosophical and linguistic. What do we mean by e.g. that some law captures the “essence of what’s happening”. Of course I understand that you mean “discarding unnecessary details”, but “essence” can also mean something metaphysically fundamental; something without which the object of interest would no more be the same; its essence…

If someone read your sentence with that more metaphysical definition of “essence” in their mind, they would have a completely different picture in their mind. To them the sentence would read as laws capturing some divine unchanging truth about the universe.

(This ties in with the old debate about whether laws are “discovered” or “invented”; are we uncovering hidden truths or building practical tools; realism versus instrumentalism…)

In my experience many people – not physicists, “ordinary” people – do easily interpret such phrases in the more metaphysical way. It’s a bit like those pictures in which you can either see a face or a vase: once you are accustomed to seeing it one way, it takes some effort to change your perception.

Of course this is nothing specific to discussion about science or laws. Communication is always noisy and ambiguous, and reliant on some shared understanding of the concepts used.

To Martha:

Fair enough, it is not as simple as I made it out to be.

• Martha (Smith) says:

I am not Newton said: “in predicting cancer people usually understand that we deal with uncertainty,”

I’m not so convinced about this. For example, physicians often say that someone “is at risk for cancer” (or some other disease or injury) … Many of them consider this as a binary variable: either someone is “at risk” or “not at risk”. (Of course, most readers of this blog realize that there are degrees of risk, as well as different ways of measuring risk, but I think a lot of physicians don’t.)

• David Marcus says:

Not sure what you mean by “QM works very well for the very small objects but not for the large and slow” or “Solitary photons misbehave when up against test slits”. For large and slow objects, quantum mechanics reduces to Newtonian mechanics. Solitary photons act similarly to solitary electrons, although I don’t know if anyone has worked out all the formulas.

4. Michael Watts says:

I don’t see a big difference between how easy it is to say “he never lies” and “he’s usually honest”.

I also disagree that “y = a + bx + error” implies that y is approximately a + bx for all values of x. It says that y is approximately a + bx for all values of x _for which error is approximately zero_, which is a very different claim.

5. Interesting post. My comments are from the standpoint of trying to guess how big a difference it would make to be more explicit about scale.

>The more general issue is that it takes fewer bits to make a big claim than to make a small claim. It’s easier to say “He never lies” than to say “He lies 5% of the time.” Generics are mathematically simpler and easier to handle, even though they represent stronger statements. That’s kind of a paradox.

A few months back I was thinking about how it’s kind of funny that multiple proposed reforms for reporting statistical analyses, like in social science, propose more complete uncertainty depiction as though it’s the obvious answer without really acknowledging how bad the average person/researcher is at maintaining the appropriate level of uncertainty when they consume claims. Maybe it’s a hardware issue, where communication and even thinking get too hard when we have to maintain contingencies in mind, and so it’s very hard to communicate without brushing aside some of the second order details. (Some of this discussion reminds me of partial identification in economics as well, which criticizes downplaying the assumptions we make in parameter estimation.)

At the same time, I think that many people, when consuming claims, do reserve some disbelief. It’s just more implicit and doesn’t always correspond to the limitations that we can mathematically define. So while I see saying ‘he never lies’ as a stronger statement on the surface, I’m not sure its reception by most people would be all that different from the modified version. And isolating how different it is behaviorally can be hard, because the noise in trying to measure the difference might be bigger than the effect.

Also reminds me a bit of the role that axis scaling plays in consuming graphs – it’s often overlooked by viewers and even researchers how much what you infer from a graph depends on the rules used to determine appropriate axis ranges.

• jim says:

“how bad the average person/researcher is at maintaining the appropriate level of uncertainty when they consume claims.”

We know this is true. But suppose the average researcher always had accurate information about uncertainty and there was a demand for them to understand that information. Would their intuitive conception of it improve? That would be an interesting line of research.

“So while I see saying ‘he never lies’ as a stronger statement on the surface, I’m not sure its reception by most people would be all that different from the modified version.”

So true, hilarious, who actually believes a statement like “he never lies”?

The default (well, at least for me) was to think of physics as describing “laws of the universe” but statistics and social science to describe “patterns.” We expect laws of the universe to be universal, so when they break down it’s disappointing. No one expects patterns to be universal.

• Andrew says: