The reason for log transforming your data is not to deal with skewness or to get closer to a normal distribution; that’s rarely what we care about. Validity, additivity, and linearity are typically much more important.

The reason for log transformation is that in many settings it should make additive and linear models make more sense. A multiplicative model on the original scale corresponds to an additive model on the log scale. For example, consider a treatment that increases prices by 2%, rather than a treatment that increases prices by $20. The log transformation is particularly relevant when the data vary a lot on the relative scale: increasing prices by 2% has a much different dollar effect for a $10 item than for a $1000 item. This example also gives some sense of why a log transformation won’t be perfect either, and ultimately you can fit whatever sort of model you want. But, as I said, in most cases I’ve seen of positive data, the log transformation is a natural starting point.
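Here is a minimal sketch of the price example in Python. The helper name `apply_percent_treatment` and the specific prices are made up for illustration. It shows that a 2% increase is a different dollar amount for each item but the same shift on the log scale.

```python
import math

def apply_percent_treatment(price, pct):
    """Hypothetical helper: raise a price by pct percent (a multiplicative effect)."""
    return price * (1 + pct / 100)

# Illustrative prices only: a $10 item and a $1000 item.
prices = [10.0, 1000.0]
treated = [apply_percent_treatment(p, 2.0) for p in prices]

# On the original scale, the effect depends on the starting price.
dollar_effects = [t - p for p, t in zip(prices, treated)]

# On the log scale, the effect is the same additive shift, log(1.02), for both items.
log_effects = [math.log(t) - math.log(p) for p, t in zip(prices, treated)]

print("dollar effects:", dollar_effects)
print("log-scale effects:", log_effects)
```

The dollar effects differ by a factor of 100, while the two log-scale effects agree up to floating-point rounding. That is the sense in which a multiplicative model becomes additive after taking logs.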

The above is all background; it’s stuff that we’ve all said many times before.

What’s new to me is this story from Shravan Vasishth:

You’re routinely being cited as endorsing the idea that model assumptions like normality are the least important of all in a linear model:

Non-normality is relatively unimportant; at worst you just may lose a bit of power. I strongly recommend @StatModeling & Hill (2007, pp. 45-47)'s summary of key regression model assumptions. Normality of errors literally gets LOWEST priority. My experience supports this. 3/3 pic.twitter.com/R0BfQCoxdK

— Roger Levy (@roger_p_levy) December 8, 2018

This statement of yours is not meant to be a recommendation to NHST users. But it is being misused by psychologists and psycholinguists in the NHST context to justify analyzing untransformed all-positive dependent variables and then making binary decisions based on p-values. Could you clarify your point in the next edition of your book?

I just reviewed a paper in JML (where we published our statistical significance filter paper) by some psychologists that insist that all data be analyzed using untransformed reaction/reading times. They don’t cite you there, but threads like the one above do keep citing you in the NHST context. I know that on p 15 of Gelman and Hill you say that it is often helpful to log transform all-positive data, but people selectively cite this other comment in your book to justify not transforming.

There are data-sets where 3 out of 547 data points drive the entire p<0.05 effect. With a log transform there would be nothing to claim and indeed that claim is not replicable. I discuss that particular example here.

I responded that (a) I hate twitter, and (b) In the book we discuss the importance of transformations in bringing the data closer to a linear and additive model.

Shravan threw it back at me:

The problem in this case is not really twitter, in my opinion, but the fact that people . . . read more into your comments than you intended, I suspect. What bothers me is that they cite Gelman as endorsing not ever log-transforming all-positive data, citing that one comment in the book out of context. This is not the first time I saw the Gelman and Hill quote being used. I have seen it in journal reviews in which reviewers insisted I analyze data on the untransformed values.

I replied that this is really strange, given that in the book we explicitly discuss log transformation.

From page 59:

It commonly makes sense to take the logarithm of outcomes that are all-positive.

From page 65:

If a variable has a narrow dynamic range (that is, if the ratio between the high and low values is close to 1), then it will not make much of a difference in fit if the regression is on the logarithmic or the original scale. . . . In such a situation, it might seem to make sense to stay on the original scale for reasons of simplicity. However, the logarithmic transformation can make sense even here, because coefficients are often more easily understood on the log scale. . . . For an input with a larger amount of relative variation (for example, heights of children, or weights of animals), it would make sense to work with its logarithm immediately, both as an aid in interpretation and likely an improvement in fit too.
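To illustrate the dynamic-range point in that passage, here is a rough sketch. The ranges 100 to 110 and 1 to 1000 are arbitrary choices, and `pearson` and `linearity_of_log` are hypothetical helpers. It measures how close log(x) is to a straight line in x by computing the correlation between x and log(x) on an evenly spaced grid.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def linearity_of_log(low, high, n=101):
    """Correlation between x and log(x) on an evenly spaced grid from low to high."""
    step = (high - low) / (n - 1)
    xs = [low + i * step for i in range(n)]
    return pearson(xs, [math.log(x) for x in xs])

# Narrow dynamic range: ratio of high to low is 1.1.
narrow = linearity_of_log(100.0, 110.0)
# Wide dynamic range: ratio of high to low is 1000.
wide = linearity_of_log(1.0, 1000.0)

print("narrow range:", narrow)
print("wide range:", wide)
```

When the ratio of high to low is close to 1, the correlation is very nearly 1, so a regression on x and a regression on log(x) will fit about equally well. Over a wide range the two scales diverge, and the choice between them matters for fit as well as for interpretation.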

Are there really people going around saying that we endorse not ever log-transforming all-positive data? That’s really weird.

Apparently, the answer is yes. According to Shravan, people are aggressively arguing for not log-transforming.

That’s just wack.

Log transform, kids. And don’t listen to people who tell you otherwise.