One of the discussions in *Behavioral and Brain Sciences* of Seth Roberts’s article on self-experimentation was by Martin Voracek and Maryanne Fisher. They had a bunch of negative things to say about self-experimentation, but as a statistician, I was struck by their concern about “the overuse of the loess procedure.” I think lowess (or loess) is just wonderful, and I don’t know that I’ve ever seen it overused.

Curious, I looked up “Martin Voracek” on the web and found an article about body measurements from the British Medical Journal. The title of the article promised “trend analysis” and I was wondering what statistical methods they used–something more sophisticated than lowess, perhaps?

They did have one figure, and here it is:

Voracek and Fisher, the critics of lowess, fit straight lines to clearly nonlinear data! It’s most obvious in their leftmost graph. Voracek and Fisher get full credit for showing scatterplots, but hey . . . they should try lowess next time! What’s really funny in the graph are the little dotted lines indicating inferential uncertainty in the regression lines–all under the assumption of linearity, of course. (You can see enlarged versions of their graphs at this link.)

As usual, my own house has some glass-based construction and so it’s probably not so wise of me to throw stones, but really! Not knowing about lowess is one thing, but knowing about it, then fitting a straight line to nonlinear data, then criticizing someone else for doing it right–that’s a bit much.
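To make the contrast concrete, here is a minimal sketch in plain Python comparing a straight-line fit with a lowess-style smooth on curved data. Everything in it is an assumption made for illustration: the data are synthetic and noise-free (a parabola, not the body-measurement data from the article), the helper names `ols_line`, `lowess`, and `rmse` are hypothetical, and the smoother is a simplified version (tricube-weighted local linear fits, no robustness iterations), not Cleveland's full procedure.

```python
import math

def ols_line(xs, ys):
    """Ordinary least-squares straight line; returns (intercept, slope)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def lowess(xs, ys, frac=0.3):
    """Simplified lowess: a tricube-weighted local linear fit at each x.

    Assumes distinct x values and at least 3 points. Unlike the full
    procedure, this sketch does no robustness (reweighting) iterations.
    """
    n = len(xs)
    k = min(n, max(3, math.ceil(frac * n)))  # neighbours in each window
    fitted = []
    for x0 in xs:
        # Bandwidth: distance from x0 to its k-th nearest point.
        h = sorted(abs(x - x0) for x in xs)[k - 1]
        # Tricube weights; points at or beyond distance h get weight 0.
        ws = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
        sw = sum(ws)
        mx = sum(w * x for w, x in zip(ws, xs)) / sw
        my = sum(w * y for w, y in zip(ws, ys)) / sw
        sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
        sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
        # Fall back to the weighted mean if the local slope is undefined.
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

def rmse(ys, fits):
    """Root-mean-square error of fitted values against the data."""
    return math.sqrt(sum((y - f) ** 2 for y, f in zip(ys, fits)) / len(ys))

# Synthetic, clearly nonlinear data: y = x^2 on an even grid over [-2, 2].
xs = [i / 10 for i in range(-20, 21)]
ys = [x ** 2 for x in xs]

intercept, slope = ols_line(xs, ys)
line = [intercept + slope * x for x in xs]
smooth = lowess(xs, ys, frac=0.3)

rmse_line = rmse(ys, line)
rmse_lowess = rmse(ys, smooth)
print("straight line: intercept", intercept, "slope", slope)
print("RMSE straight line:", rmse_line)
print("RMSE lowess sketch:", rmse_lowess)
```

On this symmetric example the least-squares line comes out essentially flat, so it reports no trend at all, while the local fit follows the curve. Any confidence band drawn around that flat line would be computed under the same linearity assumption.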

**Not just lowess**

Just to be clear, when I say “lowess is great,” I really mean “smoothing regression is great”–lowess, also splines, generalized additive models, and all the other things that Cleveland, Hastie, Tibshirani, etc., have developed. (One of the current challenges in Bayesian data analysis is to integrate such methods. Maybe David Dunson will figure it all out.)

And it looks pretty.

Seriously, though, is using lowess on plots solely an illustrative technique? After all, many, if not most, of the models fit in political science, at least, are linear. So if you're fitting a linear model, isn't showing a lowess line on a plot almost cheating?

Jeff Morris has done some nice work developing an adaptive Bayesian approach to non-parametric regression.

Robert Kass has developed an adaptive Bayesian model for smoothing splines.

Johnstone and Silverman have recently published an empirical Bayes method for threshold selection in wavelet regression.

On the other hand, I remember reading a paper in some electrical engineering journal where the authors used loess to fit internet data. Given the scarcity of samples in the long tail, the fit was essentially meaningless, even if it was non-linear!

Further Googling of Martin Voracek would have turned up this gem: National intelligence and suicide rate: an ecological study of 85 countries.

Now here's an article that has it all in terms of statistical fallacies. He attempts to argue that intelligence is a causal factor for suicidality using the following methodology: Each country in the world is assigned an IQ and this national IQ is correlated with national reported suicide rates. It's really amazing – he's managed to incorporate the ecological fallacy, reporting bias, selection bias, profound confusion about the definition of IQ (a measure of intelligence relative to a typical individual), unmeasured confounders, terrible measurement methodology and just a generally goofy scientific approach, all in one bogus study! There should be an award for this.

Um, Boris, just because the model you use is linear doesn't mean the underlying system you try to model is. Rather, I'd (naively, not being a statistician, nor a social scientist) expect linear relationships to be the exception; it usually is, after all.