“What, are you nuts? We don’t have time in AP Stats to explain to students what stats actually means. We have to just get them to grind through the computations.”

Andrew Vickers (see here and here) had this fun story about the book he wrote a few years ago, “What is a p-value anyway?” He writes:

Early on, my editor was browsing listservs and came across an AP Stats teachers group that was raving about the book (“p.43 really helped me understand confidence intervals etc etc”). So he wrote to a couple of the teachers saying, “Glad you like the book, would you assign it for your class?”, in an attempt to sell books. They wrote back and said, “What, are you nuts? We don’t have time in AP Stats to explain to students what stats actually means. We have to just get them to grind through the computations.”

Ouch!

P.S. My blurb on the Vickers book: “It’s friendly, accessible, and readable. I like it a lot.”

47 thoughts on ““What, are you nuts? We don’t have time in AP Stats to explain to students what stats actually means. We have to just get them to grind through the computations.””

  1. Have you ever looked at Reddit r/AskStatistics or r/statistics? I figure this gives a good look at what grad students do with stats. It’s truly horrible. Not every single post, but the frequency with which someone posts a message like “what test can I do to prove my research hypothesis is significant?” is truly sad. So are the answers, which are usually “the x y z tests are great” rather than a long explanation of how testing really works and what it means, or advice to build generative models and such.

    Until we replace the INTRO stats course with a Bayesian course, we are doomed, I think.

    • I feel you’re going to strongly disagree with this, but I don’t think **INTRO stats** can be replaced by a Bayesian course–Bayesian material isn’t nearly accessible enough to a substantial proportion of the people interested in taking (or required to take) stats. Keep in mind the majority of folks that take an INTRO stats course do not go on to get their PhDs (or even Master’s) in mathematics or applied statistics.

      I don’t think there will be a shift toward Bayesian stats until the material is written to be a little more accessible to people. For example, Bayesian Data Analysis is a textbook that is not accessible to many people starting their journey into statistics. (Sorry, just an opinion.) It’s probably very accessible to Daniel Lakeland. However, there are many non-Daniel Lakelands that take INTRO stats. The notation in Section 1.3 alone is enough to deter and frustrate many beginning students. At many places, calculus isn’t a pre-req for INTRO stats! I think some of the authors of Bayesian textbooks write for people that are like themselves (which is fine), but I think more authors should try to write for people that are not like themselves, since these are the people that need to be turned on to the Bayesian workflow to begin with. I think some of the well-known Bayesian textbooks could benefit from a complementary textbook that is more accessible to folks–sort of the way Tibshirani et al. do things with Elements of Statistical Learning vs. An Introduction to Statistical Learning.

      • Maybe intro to stats should not be about Bayesian or Frequentist but about understanding basic concepts like variance, randomness, sampling, data visualization, etc. By the way, many people who do Bayesian inference do not have “PhDs (or even Master’s) in mathematics or applied statistics”. And there are books (like Statistical Rethinking) that teach Bayesian theory and practice and are accessible to highly non-technical people such as plant ecologists and agronomists (my personal experience) without dumbing it down.

        • I have longed to rebuild the intro stats course to focus on variance, sampling, measurement, etc. for most of the time I’ve been teaching stats. But there’s another big barrier: Everyone *else* expects intro stats to cover the Usual Frequentist Tests. If students came out of my class without knowing “how to do a t-test”, there would be Words. (Not to mention anyone who looked at that course description would think it’s a breadth course aimed at arts students, sadly.)

        • Rob:

          You could start with Regression and Other Stories. We don’t have any p-values there. It’s not an intro book but maybe it could inspire you to set up a similar course at an intro level.

      • I think I disagree. As someone with a generally non-mathy background who has learnt both NHST and Bayesian stats, I generally think that the Bayesian approach is actually far more intuitive and comprehensible to most people. Of course it depends on the educational material you have available, but to most people coming naively to stats the idea of a sampling distribution is far more arcane than the idea of getting to a posterior distribution by combining a likelihood and a prior. I think Jaynes makes a compelling argument that our natural tendency in making inferences is pretty Bayesian, for example.
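        To make that last idea concrete, here is a minimal prior-to-posterior sketch. The conjugate Beta-binomial setup and all the numbers are invented purely for illustration; they are not from the comment above.

        ```python
        # Beta-binomial update: a Beta(a, b) prior on P(heads) combined with
        # k heads in n flips gives a Beta(a + k, b + n - k) posterior.
        # All numbers here are invented for illustration.

        from scipy import stats

        a, b = 2, 2        # prior: mild belief that the coin is near-fair
        k, n = 7, 10       # data: 7 heads in 10 flips

        posterior = stats.beta(a + k, b + n - k)
        lo, hi = posterior.interval(0.95)

        print(f"posterior mean of P(heads): {posterior.mean():.3f}")
        print(f"95% credible interval: ({lo:.3f}, {hi:.3f})")
        ```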

        • Daniel, Unanon and Erling all make good points. We need textbooks that are suitable for people who are not excessively “mathy”, and we need teachers who can teach this material. (I had hoped to be able to provide links to some examples, but links from my homepage seem not to be working; I might be able to retrieve some materials by looking through old backups, but that would take a lot of time.)

      • I agree with Unanon that an intro text is needed, but I disagree that the material is fundamentally hard. The material is no harder than college algebra really, particularly if you do the calculus using nonstandard analysis methods.

        I think the bigger problem is the one RobMac mentions: you have to take a stand and say no to t-tests and Wilcoxons and such. The intro stats course should be renamed introduction to mathematical modeling, should involve problems like the Stan golf model and whether you get more wet running or walking in the rain, and experiments with paper helicopters, reading in data from the ACS and figuring out how much the US spends on housing per month, determining the viscosity of honey as a function of temperature, looking at historical attitudes towards some social question on the GSS, determining the drag coefficient on a sphere from wind tunnel data, measuring the acceleration due to gravity by dropping pingpong balls, optimizing the cost to carry out some medical experiment, and so forth.

        Exactly zero t-tests, plenty of algebra, a little calculus, actual experiments with real physical objects and measurement gear, some minimal computer programming, lots of graphing. And it should be a 4-credit course with a lab.

        • Actually I think it is much harder: statistics requires one to connect the math to the real world and move from what happened this time (the sample) to learning something about the real world that generated that data and would generate similar data in the future. College algebra has very few of those concerns.

          It is only with that sense that “understanding basic concepts like variance, randomness, sampling, data visualization” is purposeful. The simple bootstrap can provide some initial sense (a minimal sketch appears at the end of this comment), but for real questions a very subtle understanding of reference sets and resampling schemes is required.

          Some interesting discussion of the logistical challenges and barriers here – Mine Çetinkaya-Rundel | Advancing Open Access Data Science Education https://www.youtube.com/watch?v=UDkUNS5GqGs
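          For what it’s worth, the simple bootstrap alluded to above can be sketched in a few lines: resample the observed data with replacement many times and look at how the statistic of interest varies across resamples. The data values below are invented for illustration.

          ```python
          # Simple (nonparametric) bootstrap for the standard error of a mean:
          # resample the data with replacement many times and look at the
          # spread of the resampled means.  The data are made up.

          import numpy as np

          rng = np.random.default_rng(0)
          data = np.array([4.1, 5.3, 2.2, 6.8, 3.9, 5.0, 4.4, 7.1])

          boot_means = np.array([
              rng.choice(data, size=data.size, replace=True).mean()
              for _ in range(10_000)
          ])

          lo, hi = np.percentile(boot_means, [2.5, 97.5])
          print(f"sample mean: {data.mean():.2f}")
          print(f"bootstrap SE: {boot_means.std(ddof=1):.2f}")
          print(f"rough 95% interval: ({lo:.2f}, {hi:.2f})")
          ```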

        • Perhaps the idea of how science works is indeed hard for people, but my point was we don’t need to introduce complicated math as well: for example, vector fields, attractors, measure theory, high-dimensional numerical linear algebra, stochastic differential equations, etc.

          It’s sufficient for a first course that people can, for example, multiply a cross-sectional area by a distance traveled to get a volume swept, or know the equation of a line, or of a parabola. Or write a loop that implements Euler’s Method for an ODE (a sketch of such a loop appears below).

          This way we force the students to engage with the scientific concepts and the link between the model and reality, without struggling to understand the formal parts.
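          A loop of that sort might look like this minimal Euler’s-method sketch. The particular ODE (dy/dt = -ky) and all constants are invented for illustration; the exact solution is printed as a check.

          ```python
          # Euler's method for dy/dt = -k * y: step forward in small
          # increments using y_{n+1} = y_n + dt * f(t_n, y_n).
          # k, y0, and the step size are arbitrary illustrative choices.

          import math

          def f(t, y):
              return -0.5 * y            # k = 0.5

          t, y, dt = 0.0, 10.0, 0.01     # start at y(0) = 10
          while t < 4.0:
              y += dt * f(t, y)
              t += dt

          print(f"Euler estimate of y(4): {y:.4f}")
          print(f"exact y(4):             {10.0 * math.exp(-0.5 * 4.0):.4f}")
          ```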

        • Agree, but struggling to understand the formal parts is not the only barrier.

          From my experience, take those away and people still struggle to get the scientific concepts and the link between the model and reality. I was surprised the first few times.

        • That’s OK; ideally this is what they should be struggling with. Instead they struggle with using the table in the back of the book to look up the appropriate percentage point of the t distribution with 17 degrees of freedom, and that’s POINTLESS.

          Twenty years ago I encouraged my sister to take a junior college stats class. She got an A but came out of it with zero useful knowledge. She got the A because she could carry out the rote t-test or linear regression calculations by hand on a test. She was never asked to enter data and make a histogram, or plot T vs C data and estimate the value of the solution temperature given the conductivity, or estimate the amount of fertilizer that maximizes the tomato crop yield, or determine the most likely quantity of lead in the drinking water as a function of season for each of the different water sources… Or anything like that.

        • “…statistics requires one to connect the math to the real world and move from what happened this time (the sample) to learning something about the real world that generated that data and would generate similar data in the future”

          Here we have a very strong disagreement.

          In my view, making predictions is the province of **science** not **statistics**. You can be the best statistician in the world but if you don’t understand the fundamental mode by which – for example – a virus is transmitted through a population, you can’t model it with statistics, no matter how hard you try.

          Statisticians, data analysts, and a whole bunch of other people with a spreadsheet or the latest version of R are forever inserting themselves into things they know nothing about, running models constrained by purported statistical theory or worse yet by their own imaginations rather than by scientific knowledge, and screwing things up badly. The pandemic is just the latest example.

          (in the pandemic, “other people” has included many public health officials and “data epidemiologists” – people who at first glance appear to be experts but in the end have no expertise in the critical knowledge set)

        • Jim, there’s a reason E. T. Jaynes named his book “Probability Theory: The Logic of Science”. Statistics shouldn’t really be its own separate discipline at the undergrad level (or even Master’s level), and I think a major reason it is comes down to Bernoulli’s Fallacy: the idea that we need to consider the sampling distribution of statistics, hypothesis testing, and so on. All of that requires either tremendous computing power (available only in the last 20 years of the world’s history) or quite sophisticated analytical mathematics.

          But what’s needed is to teach “Experimental Design and Data Analysis for Science and Engineering” and “Experimental Design and Data Analysis for Social Sciences and Humanities”.

          The main difference between the two would be the kinds of experiments done. Although not exclusively, in the first case you’d work on problems from physics, chemistry, mechanics of materials, toxicology, ecology, medicine, geology, hydrology, etc. And in the second case you’d focus on issues related to survey sampling, economic measurement, education, archeology, political science, etc.

          That’s why I say it should be a 4-credit class with a lab. People should be coming to classrooms to see worked examples, learn about theory, and ask related questions, and to labs to take measurements with instruments, calibrate things, make graphs, and write up models to be fit (or, in the social sciences version, to download datasets from the census/GSS/BLS etc., set up SQL databases, write up questionnaires to be given out to other sections of the class, analyze datasets from previous years’ surveys, and so on).

          We are just 100% focusing on the wrong things in first or second year stats class. It’s not even in the neighborhood of the right thing. Not even close.

        • Daniel, I essentially agree. A couple years ago, in the hazy times pre-Covid, I piloted a class along these lines using this text, working up to implementing the case studies contained therein in Stan:
          https://www.amazon.com/Bayesian-Models-Statistical-Primer-Ecologists/dp/0691159289

          It was reasonably successful with my small group, although I will say it had its challenges. I think that a 2-semester sequence is probably necessary for most folks to really do any justice to this topic. That said, I still like the basic progression of ideas.

          I haven’t repeated yet for boring and bureaucratic academic reasons, but I may do so again in the future. I have to spend a lot of time with my own advisees helping them unlearn various unhelpful (and sometimes patently incorrect) ideas and methods they’ve picked up elsewhere…

        • Oh, almost forgot! My capstone project was having the students model heat loss out of a jar via conduction using logged data. We estimated the thermal conductivity and tested the ability to forecast held-out data. Super fun! Well for me at least haha. All done in Stan :)
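          For a rough sense of the shape of that capstone: the original was a full Bayesian model in Stan, but the skeleton of the idea can be sketched as a least-squares fit of Newton’s law of cooling to simulated data. All constants below are invented.

          ```python
          # Sketch of the jar-cooling idea: simulate noisy temperature logs from
          # Newton's law of cooling, T(t) = T_air + (T0 - T_air) * exp(-k t),
          # fit k on early data, and check the forecast on held-out later data.
          # This is a least-squares stand-in, not the original Stan model.

          import numpy as np
          from scipy.optimize import curve_fit

          T_air, T0, k_true = 20.0, 90.0, 0.15     # invented "true" values

          def cooling(t, k):
              return T_air + (T0 - T_air) * np.exp(-k * t)

          rng = np.random.default_rng(1)
          t = np.arange(0.0, 30.0, 1.0)            # minutes
          T = cooling(t, k_true) + rng.normal(0.0, 0.5, t.size)

          train, test = slice(0, 20), slice(20, None)
          (k_hat,), _ = curve_fit(cooling, t[train], T[train], p0=[0.1])

          rmse = np.sqrt(np.mean((cooling(t[test], k_hat) - T[test]) ** 2))
          print(f"estimated k: {k_hat:.3f} (true {k_true})")
          print(f"held-out forecast RMSE: {rmse:.2f} deg C")
          ```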

    • Yeah, the questions in r/statistics can be a bit depressing. There does seem to be high-quality discussion about specific topics (rather than specific statistical tests), but it’s hard to imagine stemming the endless tide of questions about one-sample t tests and Wilcoxon signed-rank tests. It’s so much more work to explain a better analytical approach than NHST than it is to comment that t-tests assume the data are normally distributed that one despairs of ever making a dent that way. It would be a full-time job!

      • 匿,

        The problem is not just with hypothesis testing. I don’t teach any of that crap—I only teach the good stuff—but, even so, it seems that I spend most of my teaching efforts, and the students spend most of their learning efforts, on specifics, not on concepts. I’m not sure what to do about this. When I try to teach concepts, then the students don’t learn the details, and that’s even worse. So I’m not sure what to do.

      • Since when does a t-test assume normally distributed data? A t-test is ordinarily appropriate for the distribution of a sample statistic, such as the sample mean of a continuous variable (conditional on several assumptions) or the difference between two sample means of that variable. The t-distribution, when appropriate, refers to the distribution of the statistic, not the distribution of the data used in the calculation of the statistic.

        • Richard—

          I think I phrased that in a confusing way. That one should apply a t-test when “the data are normally distributed” is a particular piece of advice I’ve regularly observed commenters offer on those forums when that question (or similar ones) arises; I didn’t mean to endorse it myself.

          I don’t actually know why people say this. I’m just speculating, but I suspect that the original rationale is something like: if you have a small sample from a distribution that doesn’t seem to be normal, then the distribution of the sample average can’t be expected to be close enough to normal (or the distribution of the sample variance close enough to chi-squared) for the distribution of the test statistic to have an approximate Student’s t distribution. You’re definitely right, though, that it seems weird.
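          One way to see what’s at stake is a quick simulation; this sketch is mine, purely to illustrate the speculation above: draw many small samples from a skewed population and check how often the one-sample t statistic falls below the nominal Student’s t cutoff.

          ```python
          # With small samples from a skewed population (Exponential(1), true
          # mean 1), the one-sample t statistic is noticeably non-t-distributed:
          # the nominal 2.5% lower tail is inflated.  The normality concern is
          # about the sampling distribution of the statistic, not a rule that
          # "the data must look normal."

          import numpy as np
          from scipy import stats

          rng = np.random.default_rng(2)
          n, reps, mu = 5, 100_000, 1.0

          x = rng.exponential(scale=1.0, size=(reps, n))
          t_stat = (x.mean(axis=1) - mu) / (x.std(axis=1, ddof=1) / np.sqrt(n))

          crit = stats.t.ppf(0.025, df=n - 1)      # nominal lower 2.5% cutoff
          print("nominal lower-tail probability: 0.025")
          print(f"simulated: {(t_stat < crit).mean():.3f}")
          ```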

        • I appreciate your reply to my comment. However, it appears that you still have some difficulty with such a very basic statistical issue. I’m sorry, but even your reply to my comment suggests that perhaps you should not be teaching statistics to others.

        • I think it’s you that is missing the mark.

          匿名 was referring to my original post, in which I despair of the kinds of things you read in r/statistics. It is definitely the case that on that forum and other Reddit statistics forums you see people asking questions like “when should I use a t-test?” and the people on the forum reply “use a t-test when the data are normally distributed.”

          Those forums are very interesting because if you do ask a sophisticated question, you’ll get a sophisticated answer, but if you ask a basic question you’ll get a cookbook answer that’s wrong in too many ways to even engage with.

        • Richard—

          Daniel is in fact making the point I was trying to make (although much more clearly than I did, so I apologize if my comment was confusing).

          The advice that “the t-test assumes the data are normal” (phrased more or less exactly that way) does seem quite common on the statistics subreddits. Here are a few off-hand examples: https://www.reddit.com/r/statistics/comments/24qj6j/should_i_use_wilxocon_signed_test_or_ttest/, https://www.reddit.com/r/statistics/comments/9qrzu7/is_it_wrong_to_always_use_wilcoxon_tests/, https://www.reddit.com/r/datascience/comments/blvwoe/paired_ttest_or_wilcoxon_signed_rank/. It’s also made its way onto the Wikipedia page for the Wilcoxon signed-rank test: https://en.wikipedia.org/wiki/Wilcoxon_signed-rank_test.

          I still don’t really understand why people are saying this. I haven’t found an obvious source in any of the textbooks I’ve looked at so far. I thought perhaps I was misunderstanding, and that what they mean is that the t-test assumes that the distribution of the sample average is normal. That doesn’t seem consistent, though, with the fact that at least some posters advise looking at the distribution of the sample itself to confirm normality.

    • It would be great if grad students in disparate fields were solid statistically but current incentives and hyper-specialization make this implausible. In fact, learning just the minimal amount of (incorrect) stats to get published is probably the correct optimization of their time. This is totally analogous to how non-CS — and frankly also many CS — grad students have very poor command of basic programming or software engineering concepts.

      And I think that’s fine! Just like Python, for most grad students “stats” is a tool to achieve an outcome in some other domain, not something they value intrinsically. It’d be more practical to view this as a UX problem and focus on creating user-proof tooling where possible. Bayesian intro courses are fine but won’t address the basic issue of incentives.

  2. Based on this post, I decided to buy the book. I went to Amazon.com, entered the title in the search box, and hit return.

    I found the search result somewhat interesting. I’ve uploaded a screenshot (https://imgur.com/a/BvNAoaC)

    Result 1: “The Killer Pile of Dog Poop”, by Gary Champlin Jr.
    Result 2: “What is a p-value Anyway”

    Interesting. Some sort of conspiracy?

  3. Isn’t the curriculum of AP and most college courses established by a certification board? This is important: higher-level courses need to know what knowledge students already have. Also, a consistent curriculum allows transfer credit between schools. Most likely any significant change to a core course would mean it would have to be taught under a different name or number and couldn’t technically replace the core course it’s designed to replace, at least not for transfer credit.

    • I believe that the AP curricula are established by some kind of certification board or similar entity. But there are no such “certification boards” that I am aware of for standardization of curriculum or courses. Some departments may have a designated textbook for a particular course, but many departments allow the instructor to choose the textbook, as long as the syllabus is covered.

      • “But there are no such “certification boards” that I am aware of for standardization of curriculum or courses”

        Hmmm…I don’t think that’s correct. I think if you dig into the details, almost every university belongs to some kind of accreditation standard that specifies some content for the courses. I could be wrong on this, but I don’t see how universities could accept transfer courses without it. They can’t go analyze the content of every course at every other university.

        I don’t think this standard has much in the way of teeth – instructors have always exercised substantial freedom – but I believe the published course description has to at least pay lip service to it.

        • When it comes to Engineering programs, this standards board (ABET) is pretty powerful and would prevent you from doing anything innovative at all.

          It’s powerful in part because it’s virtually impossible to get a license to practice Engineering without having graduated from an undergraduate program accredited by ABET. You’re not even allowed to sit for the exam without enough “credits”, most of which come from the undergraduate degree. There’s a side channel, but it’s for people who’ve been working at engineering companies as draftspeople for 20 years or stuff like that.

  4. “What, are you nuts? We don’t have time in AP Stats to explain to students what stats actually means. We have to just get them to grind through the computations.”
    Not an AP course, whatever that is, but it seems to describe my first stats course, taught in the Psych department. This was pre-calculator & computer for the bog-standard undergrad. We needed to memorize the equations for a two-way ANOVA, etc. I don’t remember any real discussion of what to do with stats, but we learned how to look up t-values in the back of the book.

  5. I have taught the intro stat course many times – I still haven’t found the text I want (Andrew, I’m still waiting for you to write it). The most appealing type of book for me would be along the lines of Spiegelhalter’s “The Art of Statistics: How to Learn from Data.” But if you look at that book, it wouldn’t serve as a text for the course, at least on its own. It is conceptual – but where are the exercises, step-by-step instructions, coordination with computing packages, etc.? My point is that the problem with intro stats is really the problem with most of higher education. Texts have become user guides, not conceptual, thought-provoking invitations to students to examine and understand the world around them. The modern text has so much baggage tied to it (much driven by the movement towards constant “assessment,” measurable “learning objectives,” etc.) that conceptual books like Spiegelhalter’s could only be used as an adjunct to a “real” statistics text. But by the time students have to learn from a “real” text, the value of the conceptual readings is all but lost.

    In theory, it doesn’t have to be that way. But for most students – taking 5 courses at a time, trying to enjoy college, putting up with required general ed courses (I’m not against these – but the reality is that they are more of a full-employment act for disciplines than a coherent development of critical thought), with large classes and the need for everything to be measurable – they are lucky to escape college without their interest in learning becoming destroyed in the process.

    Sure, I overstate. The exceptions (particular colleges, instructors, students) are numerous. But the reality for most students is closer to what I describe, in my opinion. Intro stats is but one example of a far larger problem. But I’d suggest that intro stats is perhaps the canary – the difficulties that course has in achieving what we think it should do highlight issues associated with most undergraduate education. They are (arguably) more serious with statistics due to the difficulty and depth of the subject matter and the numerous competing demands put on this one statistics course required for so many different programs.

    • I think Dale has it right. The signaling component of undergrad education has reached nearly 100%; the course content has ossified because what matters is that you have taken that course and the employer knows what that means for your willingness to sit behind a desk and calmly ruin the lives of medical patients as an insurance desk jockey refusing coverage of a major medical procedure according to the rules described by your boss. That you be knowledgeable about the world and able to evaluate evidence is not only unneeded but counterproductive to the business. No one who had taken an ethics course from the philosophy dept, could evaluate medical research thanks to their stats classes, and had knowledge of the economic conditions created by modern policy through substantive economics classes would ever work for a health insurance company, pharma marketing department, credit card issuer, fracking company, military contractor, or surveillance tech company… And what’s left of the modern economy after we eliminate those?

    • ‘(much driven by the movement towards constant “assessment,” measurable “learning objectives,” etc.)’

      I suggest that the decline of textbooks to instruction manuals is driven by the **reaction** to rigorous student assessment, not by the assessment itself.

      Student assessment isn’t new. Instructors have always given tests to assess students. What’s new about assessment in the last few decades is standardization and automation. Today faculty and students can’t fudge the results or brown-nose their way out of a poor performance. The result is parent / instructor / student demand for an “education” method – including textbooks and tests – that makes poor performance impossible. Now we can have “rigorous” assessment and fail-proof education at the same time! Win win! :)

      • What world are you living in? I won’t make any claims about K-12 education, but in higher education assessment has done none of what you claim. It’s not rigorous, unless you count numbers = rigor. Poor performance has not been wiped out – arguably, it is more prevalent than ever. Most of my “evidence” is anecdotal, but I have a LOT of anecdotal experience. There is one grain of truth in your comment (I admit you usually have one): the reaction against assessment is nothing to be proud of. Academics don’t like to be evaluated, and I won’t defend that position. But that does nothing to support current assessment practice as promoting educational quality.

  6. Holy moly, lousy teachers!? And on a topic that actually matters and is challenging? Knock me over with a feather.

    At least we know they’re heroes.

  7. We don’t have time in AP Stats to explain to students what stats actually means.

    So you teach them to “shut up and calculate”, they develop their own ideas of what it means, and chaos ensues.

    Now where have I seen that before…

    • In grad school (biomed) stats it is the same. I still remember sitting there thinking “Why are we testing this null hypothesis rather than a scientific hypothesis?”

      But I was so busy with other stuff that I just filled in the answers to pass the class (which was treated as a blow-off). Fast forward ~4 years, when I had actual data and the motivation/time to look into why it was done that way. No explanation made sense and most did not even address my concerns; I thought I must be nuts. Until I found Paul Meehl, who explains it all very clearly.

  8. I think, for high schoolers, it might be better to drop the statistics and just focus on probability. Most of the class should just be conveying probability/expected-value intuition by teaching the kids how to gamble.

    • Probability theory didn’t really become a thing until dice could be made with enough precision to be close to fair. Then there was a demand for it from wealthy gamblers. Perhaps (if it is possible to do in a few days) you should start with having them make their own unequally weighted dice.
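      A tiny simulation could sit alongside the homemade dice. Here is a sketch (the loading weights are invented for illustration) comparing the long-run average of a fair die with one weighted toward six:

      ```python
      # Expected-value intuition: compare a fair die with a die loaded
      # toward six.  The loaded weights are made up for illustration.

      import numpy as np

      rng = np.random.default_rng(3)
      faces = np.arange(1, 7)

      dice = {
          "fair":   np.full(6, 1 / 6),
          "loaded": np.array([0.1, 0.1, 0.1, 0.1, 0.1, 0.5]),
      }

      for name, p in dice.items():
          rolls = rng.choice(faces, size=100_000, p=p)
          print(f"{name:>6}: theoretical EV {faces @ p:.2f}, "
                f"simulated {rolls.mean():.2f}")
      ```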
