In statistics, we learn about Type 1 and Type 2 errors. For example, from an intro stat book:
A Type 1 error is committed if we reject the null hypothesis when it is true.
A Type 2 error is committed if we accept the null hypothesis when it is false.
(Usually these are written as I and II, in the manner of World Wars and Super Bowls, but to keep things clean with later notation I’ll stick with 1 and 2.)
Actually, though . . .
Never a Type 1 or Type 2 error
I’ve never in my professional life made a Type 1 error or a Type 2 error. But I’ve made lots of errors. How can this be?
A Type 1 error occurs only if the null hypothesis is true (typically if a certain parameter, or difference in parameters, equals zero). In the applications I’ve worked on, in social science and public health, I’ve never come across a null hypothesis that could actually be true, or a parameter that could actually be zero.
A Type 2 error occurs only if I claim that the null hypothesis is true, and I would certainly not do that, given my statement above!
But errors nonetheless
But I certainly have made errors! How can they be classified? For simplicity, let’s suppose we’re considering parameters theta, for which the “null hypothesis” is that theta=0. (For example, theta could be a regression coefficient, or a comparison between two treatment effects. In any given study, there might be many thetas of interest.)
A Type S error is an error of sign. I make a Type S error by claiming with confidence that theta is positive when it is, in fact, negative, or by claiming with confidence that theta is negative when it is, in fact, positive. I think it’s fair to say that classical 2-sided hypothesis testing fits this framework: for example, if our 95% interval for theta is [.1, .3], or if we say that theta.hat = .2 and is statistically significantly different from zero, then our scientific claim is that theta is positive, not simply that it’s nonzero.
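To make the definition concrete, here is a minimal Python sketch. The function names sign_claim and is_type_s_error are hypothetical, made up for illustration. It also assumes we know the true theta, which in a real study we would not.

```python
def sign_claim(lower, upper):
    """Return +1 or -1 if the interval excludes zero, else 0 (no confident claim)."""
    if lower > 0:
        return 1
    if upper < 0:
        return -1
    return 0


def is_type_s_error(lower, upper, theta_true):
    """True if we confidently claimed a sign and the true theta has the opposite sign.

    theta_true is assumed known here purely for illustration.
    """
    claim = sign_claim(lower, upper)
    if claim == 0:
        return False  # no confident claim about sign, so no Type S error
    return claim * theta_true < 0


# The interval [.1, .3] from the text amounts to a claim that theta is positive.
print(sign_claim(0.1, 0.3))
print(is_type_s_error(0.1, 0.3, theta_true=-0.05))
```

Note that an interval that includes zero never produces a Type S error under this definition, because no claim about the sign has been made with confidence.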
A Type M error is an error of magnitude. I make a Type M error by claiming with confidence that theta is small in magnitude when it is in fact large, or by claiming with confidence that theta is large in magnitude when it is in fact small. The well-known problem of publication bias could lead to systematic Type M errors, with large-magnitude findings more likely to be reported.
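One way to see how selective reporting could produce Type M errors is a small simulation. This is only a rough sketch under assumptions that are mine, not from any particular study: the estimate is normally distributed around a small nonzero true theta with a known standard error, and only estimates that are statistically significant at the 5% level get reported. The function name simulate_filtered_errors and the numbers theta_true=0.1 and se=0.5 are hypothetical.

```python
import random


def simulate_filtered_errors(theta_true, se, n_sims=100_000, z_crit=1.96, seed=0):
    """Simulate estimates and keep only the 'statistically significant' ones.

    Assumes est ~ Normal(theta_true, se) and theta_true != 0 (illustrative only).
    Returns the share of estimates reported, the Type S rate among them, and
    the average factor by which reported magnitudes overstate |theta_true|.
    """
    rng = random.Random(seed)
    significant = []
    for _ in range(n_sims):
        est = rng.gauss(theta_true, se)
        if abs(est) > z_crit * se:  # the significance filter
            significant.append(est)
    if not significant:
        return None
    wrong_sign = sum(1 for est in significant if est * theta_true < 0)
    mean_abs = sum(abs(est) for est in significant) / len(significant)
    return {
        "share_significant": len(significant) / n_sims,
        "type_s_rate": wrong_sign / len(significant),
        "exaggeration": mean_abs / abs(theta_true),
    }


result = simulate_filtered_errors(theta_true=0.1, se=0.5)
print(result)
```

With these made-up numbers, every reported estimate has magnitude above 1.96 * 0.5 = 0.98, so each one overstates a true theta of 0.1 by nearly a factor of ten or more, and some of them have the wrong sign as well.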
Does this matter? If we just do straight Bayesian inference with continuous prior distributions and work with posterior inferences, then it’s not really so important. If we want, we can compute Type S and Type M error rates corresponding to various posterior summaries (that’s what we do in the paper linked to above) but this is just a theoretical curiosity.
Thinking about error rates does make a difference, however, if we start selecting procedures based on their Type 1 error rates, Type 2 error rates or whatever. Then I think you’re asking for trouble, for the reasons noted above. Thus these ideas could be useful in pointing us away from theoretical and methodological dead ends.