In response to the discussion of his remarks on assumptions vs. conditions, Jeff Witmer writes:

If [certain conditions hold], then the t-test p-value gives a remarkably good approximation to “the real thing” — namely the randomization reference p-value. . . .

I [Witmer] make assumptions about conditions that I cannot check, e.g., that the data arose from a random sample. Of course, just as there is no such thing as a normal population, there is no such thing as a random sample.

I disagree strongly with both of the above paragraphs! I say this not to pick a fight with Jeff Witmer but to illustrate how, in statistics, even the most basic points that people take for granted can’t be.

Let’s take the claims in order:

1. *The purpose of a t test is to approximate the randomization p-value.* Not to me. In my world, the purpose of t tests and intervals is to summarize uncertainty in estimates and comparisons. I don’t care about a p-value and almost certainly don’t care about a randomization distribution. I’m not saying this isn’t important; I just don’t think it’s particularly fundamental. One might as well say that the randomization p-value is a way of approximating the ultimate goal, which is the confidence interval.

2. *There is no such thing as a random sample.* Hey–I just drew a random sample the other day! Well, actually it was a few months ago, but still. It was a sample of records to examine for a court case. I drew random numbers in R and everything.
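Mechanically, a draw like that is simple. Here is a hypothetical sketch in Python rather than R (the number of records, the sample size, and the seed are all invented for illustration):

```python
# Hypothetical sketch of drawing a simple random sample of records for
# review; the record count and sample size are invented.
import random

record_ids = list(range(1, 5001))        # pretend there are 5,000 case records
random.seed(20110515)                    # fixed seed so the draw is reproducible
sample = random.sample(record_ids, 100)  # simple random sample, no replacement

print(len(sample))
```

Whether the pseudo-random seed disqualifies this as a "random sample" is, of course, exactly the dispute below.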

Not to mention that uncertainty is a property of your mind not a property of the external world.

Contrary to some other comments, it doesn't surprise me this has generated much interest. It's at the heart of how statistics is done and at the heart of how we talk about how statistics is done. The pivot is the difference between (a) making explicit the assumptions (axioms, postulates) on which pure mathematical work depends and (b) making claims about how data are generated "out there".

A more brutal word might better underline their character, and I don't think "conditions" fits the bill. Assumptions are usually assertions, so "assertion" would be the more candid term, although the chance of that supplanting common usage is equally small.

I'll say it bluntly:

The purpose of the two-sample t-test (with unequal variance) is to compare two population means.

Disputes???

Of course there exists a random sample. I think what Jeff Witmer is getting at is how representative a sample is of the target population. You just have to state your target population clearly when you draw the sample. For example, if I sample students at my university for a study, I would only generalize the results to my local university.

The algorithm that R used to choose your random numbers wasn't a truly random process.

Even if your sample of records was a random sample (and it cannot possibly have been, because random samples, like normal distributions, by definition do not exist in the real world), it was a finite-population sample taking on a discrete set of values, and the "assumptions" of the t-test are obviously violated. The reason "assumptions are always untrue" matters is that it logically invalidates all inference that is conditional on the truth of the model. It shifts the emphasis in statistics from reasoning in terms of models to reasoning in terms of techniques.

Andrew:

1) I did not write that the purpose of a t-test is to get a p-value that approximates the randomization p-value. I wrote that the t-test p-value approximates the randomization p-value. If you don't care about the p-value, that's OK. If you do, then what you (or at least most people who perform a t-test) really want is the randomization p-value, but the t-test p-value is a very good approximation. [Aside: If RA Fisher had owned a computer like the one I'm now using, would the t-test, or the ANOVA F-test, be part of the standard intro stat syllabus? I think not.]

2) If you used R, then you did not take a truly random sample. A darned good sample based on pseudo-random numbers, sure. But not a random sample. I'm being pedantic, of course, as I was in my earlier comment! I join you in saying that "even the most basic points that people take for granted can't be." One example is that people take for granted that a sampling procedure is "random." For all practical purposes it might be. To a philosopher it can never be – unless God(?) gives us the "random" numbers ;-)
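The approximation claim in (1) is easy to check numerically. A quick sketch in Python (the data are simulated for illustration, not taken from any real example in this discussion), comparing the two-sample t-test p-value against a randomization reference p-value:

```python
# Simulated check: two-sample t-test p-value vs. randomization p-value.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
x = rng.normal(0.0, 1.0, 30)   # group A
y = rng.normal(0.5, 1.0, 30)   # group B, mean shifted by half an SD

t_obs, p_t = stats.ttest_ind(x, y)   # classical pooled-variance t-test

# Randomization reference p-value: shuffle the group labels repeatedly and
# count how often |t| is at least as extreme as the observed |t|.
pooled = np.concatenate([x, y])
n, n_perm, hits = len(x), 10_000, 0
for _ in range(n_perm):
    perm = rng.permutation(pooled)
    t_perm, _ = stats.ttest_ind(perm[:n], perm[n:])
    hits += abs(t_perm) >= abs(t_obs)
p_rand = hits / n_perm

print(round(p_t, 3), round(p_rand, 3))  # typically agree to a couple of decimals
```

With moderate sample sizes the two p-values usually land within a hundredth or two of each other, which is the "remarkably good approximation" being argued over.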

I thank Kaiser (commenting on the earlier post) for reminding me that one of my motivations for replacing "assumption" with "condition" is that, for me at least, it is kind of easy to think of an assumption as a yes/no, true/false kind of thing. The word "condition" connotes, to me at least, something that might be only partly true, but perhaps "true enough" for my purposes. E.g., the validity of my t-test depends on normality and maybe that looks a bit shaky, but my sample size is large, or P is extremely large (or small), so I'm not much worried that the underlying population is a bit skewed.

Jeff

Vinh, a couple of other purposes are discussed in Reichardt and Gollob and their references.

This is why you should use physical random numbers like HotBits. While indistinguishable from PRNGs, they allow you to get all sanctimonious.

Not to return to the theme of the previous post, but it seems that the importance of most untestable assumptions is in imbuing abstract statistical models with real-world importance.

Take the simple case of a conditional linear regression used to estimate the parameters of a structural/causal model. In this case "assumption" is entirely valid (and, re: Jeff, never yes/no). We have to assume, for example, no unmeasured confounding. We may have to assume a causal linear relationship between exposure and outcome – and though we can check this in the data, we can never be sure that any non-linearity isn't due to some confounding factor.

C Ryan King: "This is why you should use physical random numbers like HotBits. While indistinguishable from PRNGs, they allow you to get all sanctimonious."

LOL

Re Witmer: I guess I never think of assumptions as yes/no right/wrong. (This may reflect the fact that I don't have that many years of math.) To paraphrase, statistical assumptions are seldom 100% correct and verifiable, but often close enough to right to be useful.

HotBits is cool, thanks for letting me know about it.

Is there an R RNG that will produce genuinely random numbers, as, for example, from HotBits? If not, there should be.

There are commercial hardware RNGs which feed into /dev/random and /dev/urandom. I don't know whether R can be instructed to take entropy from there, since I've never seen an R application that needs cryptographic-level randomness.
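For what it's worth, you don't need R to pull from the OS entropy pool. A small Python sketch (this is about the mechanism, not a recommendation for any particular study):

```python
# Sketch: draw from the OS entropy pool instead of a seeded PRNG.
# random.SystemRandom reads from os.urandom, which on Linux is backed by
# the kernel CSPRNG (/dev/urandom) and mixes in hardware entropy when
# a hardware RNG is available.
import random

sysrand = random.SystemRandom()         # no seed; state lives in the OS
u = sysrand.random()                    # a float in [0, 1)
sample = sysrand.sample(range(100), 5)  # a small random sample of indices

print(u, sample)
```

Note that a SystemRandom draw is not reproducible, which is exactly why seeded PRNGs remain the default for statistical work.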