Jeff Witmer writes:

I noticed that you continue the standard practice in statistics of referring to assumptions; e.g. a blog entry on 2/4/11 at 10:54: “Our method, just like any model, relies on assumptions which we have the duty to state and to check.”

I’m in the 6th year of a three-year campaign to get statisticians to drop the word “assumptions” and replace it with “conditions.” The problem, as I see it, is that people tend to think that an assumption is something that one assumes, as in “assuming that we have a right triangle…” or “assuming that k is even…” when constructing a mathematical proof.

But in statistics we don’t assume things — unless we have to. Instead, we know that, for example, the validity of a t-test depends on normality, which is a condition that can and should be checked. Let’s not call normality an assumption, lest we imply that it is something that can be assumed. Let’s call it a condition.

What do you all think?

I don't have a thought yet on the terminology, but normality is DEFINITELY NOT a condition for the t-test to be valid. Recall the CLT! I (and many others) consider the t-test a non-parametric test (it requires no distributional assumption other than the existence of the first and second moments in the two populations we are interested in). See Lumley et al. 2002, _The importance of the normality assumption in large public health data sets_.

Is the thing you are checking exactly the thing you are checking against, or is it something that you check to see if it matches closely enough so that some rule applies?

An example from physics is the classic "assume we have a spherical cow …". Of course cows are cow-shaped but when you are making a model to see how fast they were running when they fell off a cliff, then this assumption flies, somewhat.

In this case "assumption" strikes me as the more accurate word for what we are doing, more so than "condition."

Could you define assumption and condition?

Might it make sense to distinguish between "conditions" that are checkable by looking at just the data, like normality, and "assumptions" that the model in some sense reflects reality, such as particular linear or non-linear relationships among variables? The latter can be explored using various model selection methods, but to say that you're "checking" doesn't seem right. That is, in the sense that all models are wrong, the "assumptions" are the ways that you're using judgment/prior knowledge/blind luck to find a model that's at least useful.

It's true that you can check whether data are normally distributed, but if you're checking rather than literally assuming normality, doesn't that change the final distribution of the t-test statistic? Sometimes we'll accept or reject normality when we shouldn't, and that pre-test error propagates into everything downstream.

I think that would be confusing since we already "condition on" lots of things which are observations or designed-in facts not assumptions, e.g. conditional on the sample size … .

The problem is not assumptions vs. conditions; it is that the results get grabbed onto while the constraints of the model are lost, and the results get used for things they are not valid for. Reporters and bosses are sorely tempted to run with whatever result backs up their bias.

The proposal risks confusing 'condition' and 'conditioning', since if you are a Bayesian you are going to condition on data (which are known, not assumed).

I agree with Markk, in that models often do have _assumptions_ that are treated as true in the discussion of the findings. There's also an important conceptual difference between a condition (the model works mathematically iff the distribution is normal) and an assumption (the model has predictive power if we assume people are all rational actors, even though we know this is not entirely true).

I wouldn't want to confuse those things, and in social science, I think it's important to emphasize both that a model relies on assumptions and that assumptions can be either a strength or a weakness of a model.

What I'd really like to see as a reader of social science models and statistical models is a language for which assumptions are most important, i.e. which would most undermine a finding if violated. Perhaps this exists and I've not run into it.

I prefer "assumption" over "condition", simply because "condition" is vague – a condition for what? Often, what we call assumptions are not conditions for the usefulness of a test (As Vinh points out above); the assumptions can be violated, and yet the procedure is useful. There's probably a better word than "assumptions", but "conditions" describes the role they play in statistics even less accurately, I think.

I'm not sure that I understand the original complaint, though being troubled by the word "assumption" resonates with me. It seems to me that "assumption" lends too much reality to what is being assumed.

In finance we make assumptions about the distribution of asset returns. Whatever assumption we make is going to be wrong because we have no idea what the true distribution is. (We do know that they are not normal.) A t-distribution with 6.4 degrees of freedom doesn't seem like a "condition" to me. We are unlikely to have enough data to reject it so testing the condition is futile. It is a working hypothesis, but not an assumption in the sense that it is really believed.

Saying that "doing a t-test assumes the data are a random sample from a normal distribution" is, at best, misleading. Voting for Obama does not assume one is a Democrat, and looking left-right-left before crossing the street does not imply one assumes a particular Poisson model for traffic density. What one really means to say is "the t-test has certain optimal properties if the data are random samples from two shifted normal distributions". This correct mathematical result, as formulated, is of no use for statistical practice, because data are never a random sample from a normal distribution. In fact, except perhaps in some finite population cases, it is not even clear what it means for data to "be a random sample". And even if we did agree on the meaning, there would be no way to ever verify it. What does make sense is saying "in comparing two means a reasonable technique is to divide the difference of means by the pooled standard deviation, and then transform the normalized difference to a p-value using the t-distribution". If one is a statistician, like Gosset, one then goes on by studying the operating characteristics of this technique in various real and artificial situations.

I agree with the concept but (like some other commenters) I don't like the term "conditions." Maybe someone can come up with a better word. I'll think about it.

All I can say is, I never would've thought this entry would've attracted so many comments.

It's not perfect, but 'assumptions' seems the better term, to me.

When making inference, we're often relying on properties of the data that the data have little (if any) ability to verify for us – for a review, see Leeb and Pötscher 2005, Model Selection and Inference: Facts and Fiction.

When there is little or no hope of checking these properties, calling them 'assumptions' seems justified.

Bill's point about confusion with conditioning on variables is also a good one – and it doesn't just apply to Bayesian analysis.

I agree that the terminology can confuse, particularly students, about whether or not "assumptions" need to be checked.

But while statisticians may not be in the assumption game, social scientists (and epidemiologists, my area) are. I can't think of many cases where a model is of much interest (maybe just prediction?) if all of its important assumptions can be statistically verified.

So how about piling on an adjective: testable vs. untestable assumptions. That's the language we seem to use most in causal inference in the observational health sciences.

I also did not expect this entry to draw many comments, but I appreciate them.

Those of us who have been building models and using statistical procedures for a long time are pretty comfortable with the terminology used in the field. My concern with the word "assumption" is that I teach a lot of undergraduates who are seeing these things for the first time but who have recently finished years of mathematics study in which one assumes (and states) whatever needs to be assumed when, for example, constructing a proof. When moving from a mathematics classroom to a statistics or modeling classroom the word "assumption" takes on a new meaning; hence my desire for a new word.

My concern is not limited to the t-test setting; that was just an example. But while I'm here I might note that the CLT provides no rescue when the sample size is small; we do need normality. True, we never have it (all models are wrong…) but we often have something that is close enough. If so, then the t-test p-value gives a remarkably good approximation to "the real thing" — namely the randomization reference p-value. When the underlying distribution is skewed the t-test performs badly. Thus, there are some things that must be met/satisfied/true or nearly true/etc. for us to believe the results of a t-test. Most statisticians call those things assumptions. I call them conditions.
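The small-sample point is easy to see in a quick simulation. The sketch below is purely illustrative (the lognormal example and the use of SciPy are my choices, not anything from the thread): with n = 10 per sample and a true null, the one-sample t-test holds its nominal level for normal data but not for heavily skewed data.

```python
# Type-I error of the one-sample t-test at n = 10, normal vs. lognormal data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n, reps, alpha = 10, 20_000, 0.05

# The null is true in both cases: we test against each population's actual mean.
normal = rng.normal(size=(reps, n))                 # symmetric population
skewed = rng.lognormal(size=(reps, n))              # heavily right-skewed
true_mean_lognormal = np.exp(0.5)                   # mean of LogNormal(0, 1)

p_normal = stats.ttest_1samp(normal, popmean=0.0, axis=1).pvalue
p_skewed = stats.ttest_1samp(skewed, popmean=true_mean_lognormal, axis=1).pvalue

print(f"type-I error, normal data:    {np.mean(p_normal < alpha):.3f}")  # near 0.05
print(f"type-I error, lognormal data: {np.mean(p_skewed < alpha):.3f}")  # inflated
```

With normal data the rejection rate sits near the nominal 5%; with skewed data at this sample size it is visibly inflated, which is exactly the "we do need normality when n is small" point.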

I make assumptions about conditions that I cannot check, e.g., that the data arose from a random sample. (Of course, just as there is no such thing as a normal population, there is no such thing as a random sample. But I'm not here to argue such points; I'll leave that to my next dinner with a colleague from the philosophy department.)

Although I am a card-carrying Bayesian, I had not considered that "conditioning on the data" might be confused with my proposed use of "condition." But is there really a big difference? "The distribution of theta is beta(7,5) conditional on the observed 6 successes and 5 failures." "The distribution of (ybar-mu)/SE(ybar) is a t with 7 degrees of freedom on the conditions that (a) the underlying population of Y is normal and (b) we have a SRS from that population." In each case, one thing is true given that another thing is true. I like calling the "given" thing a condition. If I can't check the condition, then maybe I'm willing to make an assumption about it being true (I'll assume that there really were 6 successes and 5 failures — whereas you might not need to assume that, if you saw the data being collected with your own eyes; I'll assume a SRS knowing, per some commenters, that all bets are off if that assumption is false, and knowing that having a random sample is more important than having normality).

I prefer "assumptions".

They usually start out as assumptions, in any event. Some can be checked, but often checking assumptions is seeing whether they can be falsified — tests which vary in their power from too weak to too strong.

Example: For a very large sample size, I might find r = .05 to be statistically significant, but might be prepared to assume independence anyway. For a small sample, I might find any number of incompatible models which can't be falsified, but I'm going to assume one.

Recently I've been trying to use "approximation" rather than assumption, on the grounds that I don't actually assume, or believe, that the "assumption" is true. A result that depends on "assumptions" invites dismissal on the grounds that the assumptions are not actually correct. In reality, all models depend on various conditions/assumptions, and the validity of the results (in terms of their relevance to reality) hinges on these conditions being close enough approximations to reality for the needs of the model, not some binary true/false distinction.

My current preference is to refer to "aspects" of the "model" I am using as an approximating substitute for knowledge of the real mechanism responsible for the unpredictable variation I am treating as "random."

I have no problem with "assumption," because I guess I don't see how they're different from the proof assumptions you're talking about. Whether you can check them or not is irrelevant. First off, assumptions that are untrue don't necessarily lead to inferences which are untrue; see Milton Friedman's Essay on Positive Economics. It's the usefulness of assumptions that matters — not their truth. There are two points to assumptions — to make results tractable and to gain assent from those to whom you present results. "If you assume X then you must accept my result Y." That's as true in social science as it is in mathematics, although the impoverished universe of mathematical objects short-circuits the number of assumptions one must make. (Though not necessarily: see Lakatos, Proofs and Refutations.) And if someone doesn't want to accept X, either because the data don't support it, or because their understanding of the world doesn't support it, or just because they want to be contrary, then the argument needs to be reformulated with other assumptions to convince the other person.

"What I'd really like to see as a reader of social science models and statistical models is a language for which assumptions are most important, i.e. which would most undermine a finding if violated. Perhaps this exists and I've not run into it."

@Graham Webster: I believe the user of a statistical method needs to know the method WELL, well enough to know the crucial assumptions needed for it to be valid. For example, in the linear regression context, normality does not need to be assumed. When I present linear regression to students, I never specify normality, except when presenting "classical linear regression", in which case I'll always mention that normality is not needed. The crucial assumption in linear regression is that of constant variance, and even when this does not hold, we can have valid inference by using robust standard errors, given a suitable sample size.

Also, to follow up with @Jeff Witmer: yes, the CLT depends on large samples. But the statistician should run simulations to learn what sample sizes are suitable (this is part of what I mean by users knowing the method well). In the t-test (with unequal variances) example, 30 observations in each group is more than enough in most situations for the asymptotics to kick in, based on my experience.
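That kind of sample-size simulation is only a few lines; the sketch below (my own illustration, with an exponential population chosen to be deliberately skewed) checks the type-I error of the Welch t-test as n grows:

```python
# Type-I error of Welch's t-test when shapes and variances differ between groups.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
reps, alpha = 20_000, 0.05

for n in (5, 10, 30):
    # The null is true (equal means), but variances and shapes differ.
    x = rng.normal(loc=0.0, scale=1.0, size=(reps, n))
    y = rng.exponential(scale=2.0, size=(reps, n)) - 2.0      # mean 0, variance 4
    p = stats.ttest_ind(x, y, equal_var=False, axis=1).pvalue  # Welch t-test
    print(f"n = {n:2d}: type-I error ~ {np.mean(p < alpha):.3f}")
```

By n = 30 the rejection rate is close to the nominal 5%, which is the sense in which the asymptotics have "kicked in" for this pair of populations.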

I think this discussion misses the point. What we need more is sensitivity analysis of assumptions/ conditions / approximations. The fact that a sample can never be "perfectly" random or "perfectly" normal or even approximately normal doesn't matter if the result does not vary significantly when these are violated in practice. I think James A is saying the same thing.

Losing sleep over inconsequential assumptions is a shame. By contrast, some of the most vicious nightmares come from those assumptions/conditions that are almost impossible to check, e.g. iid assumptions, correlation matrix, conditional independence. How we should deal with those is worth studying.

I like assumption because it implies that any model is dependent on them, and thus reminds us that it's just a model we're working with. The old adage (via Box, I believe?) holds, that "essentially all models are wrong, but some are useful". Kaiser's previous comment is spot on: a model is useful specifically when its conclusions are not sensitive to the assumptions made. Checking that a condition holds, as Witmer writes, suggests too much that you really believe the condition is a truth when in fact it's an approximation and all you can "check" is that the condition is not wildly unbelievable. This is more closely called an assumption, IMO.

I think that the word and idea of assumptions is very important, but not for this readership.

There are many many many people (e.g. researchers, academics) who use statistical methods without really understanding them. Perhaps they use a real statistical software package, or maybe they just use some of Excel's stuff. But they plug their data in and roll.

The problem is that most of these people are not methodologists and are not at all mindful of the limits of the techniques they are trying to use. Perhaps they had a statistics course at some point, but they remain incredibly unsophisticated about their use of statistics. (I see tenured faculty, graduate students, and think tankers like this. I would suggest that most are like this.)

The issue there is that the techniques they are using DO rely on certain assumptions. In order for the simple output to mean what these folks think it means, those assumptions must be met — and rarely are. They are NOT mindful of the ways they are compromising, and what that might do to their statistical inferences.

I like the word "assumption" because it implies that it is something you should check, but rarely do. It implies something that you should be more mindful of than you are. In the context of teaching statistics, I think it is great. As a matter of reviewing other people's work, I think it is great. As a matter of writing up my own work… well, if I am seriously thinking about my methods and the real limits of my statistical inferences, it's not so important, but it doesn't hurt to remind me.

what about "working hypothesis" instead of assumptions? some "assumptions" are axiomatic, and conclusions are a function of them regardless of what the data looks like. others are data-driven. these data-driven ones are the ones that we would like to test (often). although some "bayesians" don't, the "frequentists" have plenty of "goodness-of-fit" tests. perhaps none are perfect, but the point of them (imho) is to test whether the working hypothesis can be rejected given the data. working hypothesis would include iid or exchangeable, plus a statistical model, plus perhaps some approximations. thoughts?