Cosma Shalizi (of the CMU statistics dept) and I had an exchange about the role of measure theory in the statistics Ph.D. program. I have to admit I’m not quite sure what “measure theory” is but I think it’s some sort of theoretical version of calculus of real variables. I had commented that we’re never sure what to do with our qualifying exam, and Cosma wrote,
I think we have a pretty good measure-theoretic probability course, and I wish more of our students went on to take the non-required sequel on stochastic processes (because that’s the one I usually teach). I do think it’s important for statisticians to understand that material, but I also think it’s actually easier for us to teach someone how a martingale works than it is to teach them to be interested in scientific questions and to not get a freaked out, “but what do I calculate?” response when confronted with an open research problem.
Here it’s been suggested that we replace our qualifying exams with having the student prepare a written review of some reasonably-live topic from the literature and take an oral exam on it, which would be more work for us but come a lot closer to testing what the students actually need to know.
I agree that it’s hard to teach how to think like a scientist, or whatever. But I don’t think of the alternatives as “measure theory vs. how-to-think-like-a-scientist” or even “measure theory vs. statistics”. I think of it as “measure theory vs. economics” or “measure theory vs. CS” or “measure theory vs. poli sci” or whatever. That is, sure, all other things being equal, it’s better to know measure theory (or so I assume, not ever having really learned it myself, which didn’t stop me from proving 2 published theorems, one of which is actually true). But, all other things being equal, it’s better to know economics (by this, I mean economics, not necessarily econometrics), and all other things being equal, it’s better to know how to program. Etc. I don’t see why measure theory gets to be the one non-statistical topic that gets privileged as being so required that you get kicked out of the program if you can’t do it.
Cosma then shot back with:
I also don’t think of the alternatives as “measure theory vs. how-to-think-like-a-scientist” or even “measure theory vs. statistics”. My feeling — I haven’t, sadly, done a proper experiment! — is that it’s easier to, say, take someone whose math background is shaky and teach them how a generating-class argument works in probability than it is to take someone who is very good at doing math homework problems and teach them the skills and attitudes of independent research.
You say, “I think of it as ‘measure theory vs. economics’ or ‘measure theory vs. CS’ or ‘measure theory vs. poli sci’ or whatever.” I’m more ambitious; I want our students to learn measure-theoretic probability, and scientific programming, and whatever substantive field they need for doing their research, and, of course, statistical theory and methods and data analysis. Because I honestly think that if someone is going to engage in building stochastic models for parts of the world, they really ought to understand how probability _works_, and that is why measure theory is important, rather than for its own sake. (I admit to some background bias towards the probabilist’s view of the world.) At the same time it seems to me a shame (to use no stronger word) if someone, in this day and age, gets a Ph.D. in statistics and doesn’t know how to program beyond patching together scripts in R.
P.S. I think measure theory should be part of the Ph.D. statistics curriculum, but I don’t think it should be a required part of the curriculum. Not unless other important topics such as experimental design, sample surveys, statistical computing and graphics, stochastic modeling, etc., are required also. It’s sad to think of someone getting a Ph.D. in statistics and not knowing how to work with mixed discrete/continuous variables (see Nicolas’s comment below) but it seems equally sad to see Ph.D.s who don’t know what Anova is, who don’t know the basic principles of experimental design (for example, that it’s more effective to double the effect size than to double the sample size), who don’t know how to analyze a cluster sample, and so forth.
Unfortunately, not all students can do everything, and any program only gets some finite number of applicants. If you restrict your pool to those who want to do (or can put up with) measure theory, you might very well lose some who could be excellent statistical researchers. It would be sort of like not admitting Shaq to your basketball program because he can’t shoot free throws.
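To make the experimental-design example in the P.S. concrete, here is a quick sketch of why doubling the effect size beats doubling the sample size. It assumes a simple comparison of means between two equal-sized groups with a known common standard deviation; the function name and the numbers are made up for illustration.

```python
import math

def z_score(effect, sigma, n_per_group):
    """z-score for a difference in means between two equal-sized groups.

    Assumes a common, known standard deviation sigma in both groups
    (an illustrative simplification).
    """
    se = sigma * math.sqrt(2.0 / n_per_group)  # standard error of the difference
    return effect / se

# Hypothetical baseline design: effect 1.0, sd 5.0, 50 units per group.
base = z_score(effect=1.0, sigma=5.0, n_per_group=50)
double_effect = z_score(effect=2.0, sigma=5.0, n_per_group=50)
double_n = z_score(effect=1.0, sigma=5.0, n_per_group=100)

print("gain from doubling the effect size:", double_effect / base)
print("gain from doubling the sample size:", double_n / base)
```

The standard error shrinks only with the square root of the sample size, so doubling n multiplies the z-score by about 1.41, while doubling the effect size multiplies it by 2.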