New course: Street-Fighting Math

I want to teach a course next year based on two books by Sanjoy Mahajan: Street-Fighting Mathematics and The Art of Insight in Science and Engineering. You can think of the two books as baby versions of Weisskopf’s 1969 classic, Modern Physics from an Elementary Point of View. Another book in the same vein is Knut Schmidt-Nielsen’s How Animals Work, from 1972. And of course the recent What If, by Randall Munroe.

I’ve never taught such a course before. My plan would be to go through Mahajan’s two books and intersperse some material of my own on Street-Fighting Stats. Some of Mahajan’s principles, such as dimensional analysis, are, I’ve come to believe, particularly relevant in Bayesian statistics. (For some background on ways in which purely mathematical ideas come up in Bayesian modeling, see this paper from 1996 and this one from 2004.) The class would conclude with projects in which students would apply these ideas to problems of their choosing.

I foresee a few challenges in teaching this material, beyond the usual difficulties involved in starting any new course from scratch.

First, this stuff is not part of the standard curriculum in statistics, or mathematics, or political science or economics, or even physics or engineering. It’s in no sense a “required course.” Thus, as with my class on statistical communication and graphics, I’ll have to attract those unusual students across the university who want to learn something that’s useful but not part of the standard sequence of classes.

Second, the math and physics levels of these two books are pretty high. Mahajan teaches at MIT so that’s not a problem, but just about anywhere else, we’d be hard pressed to find many students who will be comfortable with this level of quantitative thinking about the world. I’m not sure how best to handle this. For example, from section 4.6.2 of The Art of Insight:

From the total energy, we can estimate the range of the 747—the distance that it can fly on a full tank of fuel. The energy is $latex \sqrt{C}mgd$ so the range of d is

$latex d\sim\frac{E_{\rm fuel}}{\sqrt{C}mg}$

where $latex E_{\rm fuel}$ is the energy in the full tank of fuel. To estimate d, we need to estimate $latex E_{\rm fuel}$, the modified drag coefficient C, and maybe also the plane’s mass m.

And he goes on from there. I love this stuff, but for most of the students I see in my statistics classes, this wouldn’t be beyond them, exactly, but . . . it would require a lot of effort on their part to work through, and I’m not clear they’d be willing to put in the work.

So maybe I’d have to attract students with stronger math and physics backgrounds. The trouble is, I don’t know that these sorts of kids are the ones who’d be inclined to take my course.

Another way to go would be to set aside all but the simplest of the physics examples and center the course on statistics and social science, where a little bit of algebra will go a long way. We could start the class with some basic material on scaling for regression models, curvature in nonlinear prediction, and partitioning of variance, and go from there. This approach would more directly serve the students who take my classes, but it has the drawback that I’m no longer following Mahajan’s books, and that would be a loss because courses go so smoothly when they follow a textbook. So I’m not quite sure what to do.

P.S. Here’s a bit from the Art of Insight book that’s a bit more relevant to statistics students:

A random and a regular walk are analogous in having an invariant. For a regular walk, it is $latex \langle x\rangle/t$: the speed. For a random walk, it is $latex \langle x^2\rangle/t$: the diffusion constant.

I really like how he put that. I hadn’t thought of a random walk and a regular, deterministic walk as being two versions of the same thing, but that makes sense. And of course a random walk with drift is a bridge between the two concepts, with random and deterministic walks being the special cases of zero drift and zero variance, respectively.
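Here is a minimal Python sketch of that bridge. It rests on assumptions that are not in the book passage: a walk on the integers whose steps are +1 with probability p and -1 otherwise, with the function name `walk_moments` made up for illustration. It computes the exact mean and variance of the position after t steps, so that the ratios to t can be compared across t. With drift, the second invariant is Var(x)/t, which reduces to ⟨x²⟩/t when the drift is zero.

```python
from fractions import Fraction
from math import comb


def walk_moments(t, p):
    """Exact mean and variance of position after t steps of +1 (prob p) or -1 (prob 1 - p)."""
    mean = Fraction(0)
    second_moment = Fraction(0)
    for k in range(t + 1):  # k = number of +1 steps
        prob = comb(t, k) * p**k * (1 - p) ** (t - k)
        x = 2 * k - t  # position after k up-steps and t - k down-steps
        mean += prob * x
        second_moment += prob * x * x
    return mean, second_moment - mean**2


cases = [
    ("zero drift", Fraction(1, 2)),     # pure random walk
    ("with drift", Fraction(3, 4)),     # the bridge case
    ("zero variance", Fraction(1, 1)),  # regular, deterministic walk
]
for label, p in cases:
    for t in (10, 40):
        mean, var = walk_moments(t, p)
        print(f"{label:13s} t={t:2d}  <x>/t = {float(mean / t):.3f}  Var(x)/t = {float(var / t):.3f}")
```

For each value of p, the two ratios should come out the same at t = 10 and t = 40. Only which of the two is nonzero changes as p moves from 1/2 to 1.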

P.S. More thoughts here.

49 thoughts on “New course: Street-Fighting Math”

  1. Off topic, but I always loved the math factoid that at any given time there’s at least one pair of antipodal points on the equator with the exact same temperature.

    One of those things that looks kinda hard to prove, or physicsey to prove, but has an easy “math only” proof.

    • Hmm… I’ve heard about the hairy ball theorem, where you always have pairs of divergence free points on the vector field of say wind-velocity on a sphere, but I think that’s true because of the vector nature of the field.

      I can see that the temperature on the equator is always periodic, and hence if it starts at temperature A and then later exceeds temperature B, it has to come back through B to get back to A. But I don’t see why the points where the temperatures are the same have to be *antipodal*.

      I could imagine constructing a temperature field where that wasn’t the case, like maybe a series of heaters going halfway around the earth… so perhaps you’re mis-stating the result slightly? In any case, pointer to a discussion on this theorem would be appreciated.

      • What if we took theta to be the longitude in radians, and T(theta) to be the temperature at that longitude along the equator? Then let,

        G(theta) = T(theta) – T(theta+pi)

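        If G(0) = 0 we’re done. Otherwise, suppose G(0) is positive (the negative case is symmetric). Because T is periodic with period 2 pi, G(pi) = T(pi) – T(2 pi) = T(pi) – T(0) = –G(0), so G(pi) must be negative. If T is continuous, then so is G, and by the intermediate value theorem there is a zero somewhere between 0 and pi, i.e., a solution to G(theta) = 0.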
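The argument above can be turned into a small numerical search. This is only a sketch: the temperature profile `example_temp` is made up (any continuous function with period 2 pi would do), and bisection is one convenient way to locate a sign change of G, not part of the proof itself.

```python
import math


def antipodal_pair(temp, tol=1e-9):
    """Find theta in [0, pi] with temp(theta) == temp(theta + pi), by bisection on G."""
    def g(theta):
        return temp(theta) - temp(theta + math.pi)

    lo, hi = 0.0, math.pi
    g_lo = g(lo)
    if g_lo == 0.0:
        return lo
    # G(pi) = -G(0) when temp has period 2*pi, so [0, pi] brackets a sign change.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0:
            return mid
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid  # sign change is in the upper half
        else:
            hi = mid  # sign change is in the lower half
    return 0.5 * (lo + hi)


def example_temp(theta):
    """Hypothetical equatorial temperature profile (made up, period 2*pi)."""
    return 25 + 4 * math.sin(theta) + 3 * math.cos(2 * theta + 1) + 2 * math.sin(3 * theta + 0.5)


theta = antipodal_pair(example_temp)
print(theta, example_temp(theta), example_temp(theta + math.pi))
```

The two printed temperatures should agree to within the tolerance, at longitudes theta and theta + pi.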

        • I see, yes, it’s not so much that for any given pair of equal temperature points they have to be antipodal, only that there has to exist at least one pair of antipodal equitemp points, and as long as temp is continuous that seems to be ok. I’ll now be working on infinitely powerful heat pumps to induce discontinuous temperature fields just to muck with this result…

        • Thinking this through reminded me of how much more fun math and physics are than slumming it in statistics. If this had been a stat question there wouldn’t have been a meeting of minds in seconds, rather everyone would still be arguing about it 200 years later.

        • Sadly, there’s some kernel of truth to that.

          Somehow Statistics has this strange Econ like aspect where there’s room for divergent “schools of thought” arguing, not some frontier issue, but the basics (without resolution) for decades. Very unnatural for a “science”.

        • Sadly, it is normative rather than descriptive – how one ought to think about and deal with uncertainty given where you find yourself “observation-ally” (what you know and don’t know).

          Yes, unnatural for a (descriptive) “science” but unavoidable in induction (the greater logic of discovery).

          Math here is just a tool, something that gets repeatedly climbed up but only kicked aside at real risk.

        • I think Keith is right, Statistics itself isn’t a science (you can’t for example do experiments to determine what the “correct” method of statistically analyzing some particular dataset is) it’s something else.

          I view statistics as a generalization of logical argument. Most of it looks like: “if the world works like FOO and I observe BAR then I can figure out that BAZ must be approximately true” a generalization of the more exact “if the world works like FOO and I observe BAR then BAZ is exactly true”

          So the Bayesian vs Frequentist “schools of thought” are a little like classical vs intuitionist logics: https://en.wikipedia.org/wiki/Intuitionistic_logic

          Apparently if you adopt intuitionistic logic you can define a number x which is not zero, but is so small that x*x = 0 (a nilpotent infinitesimal).

          So, it’s no surprise that some people think other people are talking nonsense. They start from different ideas of what the heck they’re talking about in the first place.

          You wind up in a lot of meta-arguments.

        • I often describe (part of) statistics to my math friends in a manner somewhat like:

          In math, we prove “A implies B”. In statistics, we use mathematical theorems of the sort “A implies B”, but we usually don’t know for sure whether or not A is true, so we also have to worry about questions like, “If what we have is close to A, can we conclude something close to B?”

    • This factoid has a sister factoid, equally fascinating, but much much harder to prove:

      At any instant there always exists at least one pair of antipodal points on the earth’s surface with the exact same temperature AND the exact same barometric pressure.

  2. Andrew, I really like both books by Mahajan. I think it would be a mistake to simplify and reduce the mathiness/physicsy-ness of the course. Getting exposure to this kind of thinking will help people with building Bayesian models in social sci etc. If it takes them out of a comfort zone… so be it.

    That being said, may I suggest some kind of non-standard grading policy? Make the students step up to a higher level of thinking than they’re comfortable with, but give them a risk-management strategy for getting an acceptable grade when they are pushed hard enough to fail at some of the tasks.

    Some example ideas:

    1) Have some kind of “core” requirement, which is relatively easier and closer to the student’s “native” level of background. Grade this in such a way that some acceptable level of performance there guarantees an “acceptable grade” overall.

    2) Have a series of “reaching” projects which are harder or outside the normal background level. Add the best of 3 out of 5 or something like that to the core requirement and make it so that an “A” or “excellent” level grade overall can only be achieved by some decent level of performance (maybe B) on at least a few of these “stretches”.

    The basic concept: “if you want a B- you can get it by getting a B+ on all the core stuff and not doing any of the “stretch””. If you do really well on the “core” stuff you can guarantee a straight B. If you want a B+, A-, or A you need to do some of the “stretch” and you need to do at least “B” work on 3 out of 5 of the stretch…

    Something like that makes taking the course and trying hard stuff lower risk, and hence more appealing. And, presumably, that means people will try more stuff and learn more stuff.

  3. I’ve just read the latest offering from one of the “top tier” management journals so I might just be particularly depressed about the state of quantitative reasoning in my field but my sense is that you’d be pretty hard pressed to find many faculty in the social sciences who could deal with the kind of quantitative material you are hoping to teach.

  4. Another nice book in this niche is “Consider a Spherical Cow”, which is oriented towards environmental problems, so might tend to draw in a different group of students. I don’t recall how much material is related to statistics, though. The math level is, I think, lower than some of the other books you mentioned.

    • I initially was enthused about “Consider a Spherical Cow” and its sequel(s?), but ended up not finding it useful for my students. One possible qualification: If I remember correctly, it had a lot of “back of the envelope” (AKA “Feynman”) problems, and having students do some of this type (not necessarily the ones in the book) was good for early problems. (But the idea there was not original: I encountered these in intro physics in college.)

      Another book that was good for plucking occasional problems and ideas from is “Should We Risk It?” (Kammen and Hassenzahl).

  5. Andy, have you thought about incorporating this material into more conventional classes? I can see this being very good material for a “principles” section of a linear modeling or other applied statistics course. It could give students a sense for how to justify their model choices by insight into a problem rather than, say, an algorithmic search over possible specifications.

  6. “This approach would more directly serve the students who take my classes, but it has the drawback that I’m no longer following Mahajan’s books, and that would be a loss because courses go so smoothly when they follow a textbook.”

    Courses that follow a textbook that assumes a stronger background than the students have are not likely to go smoothly. So I’d say just draw from Mahajan’s books things that are appropriate for the students you have — not just particular problems that he discusses, but applications of his ideas to other problems appropriate for your students. I suspect that the exercise of thinking about how his ideas apply to problems more suited to your students can be of great benefit for you as a teacher, and hence to your future students in other classes as well.

    (My advice is free; double your money back if you’re not satisfied.)

    • Martha,

      I’ve been struggling with this too. I use a book that has trade-offs: 1) I love the perspective; 2) it is 1/3 the price of alternatives; 3) it is way too advanced for my students.

      So I take bits and pieces of it, adjust examples for relevance to my students, and if I don’t cover it in class, it isn’t on the test. It means my lectures are harder to prepare and students have to sort through difficult material in the book to find the topics I’m covering, but I’m not sure that is a terrible trade-off on net – it isn’t so bad to read “hard” material if you know exactly what you need to get out of it and you’ve already seen it presented. This is also my “attendance policy”.

      Then again, I’ve never taught a class where I can just take a useful textbook, follow the provided lectures/notes, and have it all there for me. It sounds nice.

      • Yes, a too-hard book can be used well, but it takes extra instructor effort. (I recall once teaching from a book that wasn’t so much too hard as very poorly written. So I gave the students reading assignments with extensive notes pointing out things they needed to fill in, asking questions to help them test their understanding, etc. Then we spent a fair amount of class time having students give the missing explanations and clarifications. It worked. One student wrote on the course evaluation something like, “She took a bad book and made it into a good book!” But it was a lot of work on my part.)

        • Rahul:

          You gotta be kidding. There are lots of courses for which no good book exists. That’s why we write textbooks. There was no good book on applied Bayesian statistics until we wrote Bayesian Data Analysis. There was no good book on teaching statistics until we wrote Teaching Statistics. And so forth.

        • Tangential to the original question, but I’ve been in the situation where we were taught a course from a textbook because the instructor had written one or someone else from the faculty at that Department had written one.

          Is this situation rare? Anecdotally, I’ve also heard from junior faculty that felt pressured to teach from a textbook that some senior member on the faculty had written although they would have preferred to use another textbook.

          To my knowledge, there’s a lot of “political” aspects that enter text-book choice?

        • Using a textbook written by the instructor makes sense — the instructor probably wrote it based on what and how he/she thinks the course should be taught. (With rare exceptions, royalties for textbooks usually amount to pennies per hour spent writing the book.)

          At large universities, there is often a committee appointed to choose a textbook for a course that has several sections; the attempt at uniformity is understandable, but of course the good intention doesn’t always lead to a good choice.

        • That balance, about how much to supplement the textbook with my own notes or references to specific sub-sections, is hard for me…or I’m just lazy, hard to know. The thing is, I think that sifting through all the stuff that is there in the textbook, even if just looking for the specific stuff I talked about, is a good and helpful exercise (in learning broader things about the field I don’t cover, and learning how to carefully skim texts for relevant information). At the same time, there are clearly student efficiency gains (in terms of minimizing effort to maximize the learning I am demanding they learn) if I point them to each relevant paragraph.

          This came up recently because I was asked specifically if: a) I would put my lecture notes online (I’m leaning to yes); and b) I would put references to the specific sub-sections that would be covered in the exam (I’m leaning to no).

          It sounds like you came down further on the “give them more help” side than I plan to, and I’m a bit worried that you are right, but I still think I want them to somehow be exposed to the ideas and examples discussed in the book that won’t be on the exam but will give them a broader and more nuanced understanding of the material.

          Given that I’m pretty sure you have more experience than me at this, I’m open to any advice you have in terms of the optimal level of hand-holding and guidance for undergraduates, at least in the context of a sub-optimal textbook.

        • jrc: The optimal level of hand-holding really depends on the course and the students. The lower the level of the course (and the worse the background of the students), the more hand-holding is needed. But I do believe that having at least some assignments leading to in-class activities that foster student engagement is important in any course.

          For example, in teaching graduate courses in regression and ANOVA, since I was aware that most students didn’t really understand hypothesis testing and confidence intervals well after an introductory course, I would give a handout on those for students to read for the second class day, and also give a list of statements to classify by how well they described a p-value or confidence interval. Then in class, I would start by having students discuss their answers with a classmate, then proceed to a show of hands for each potential description, perhaps asking individual students to justify their answers. This was a fairly non-threatening way to get them talking and asking questions.

          For an undergraduate class, I would do much more in the way of a study guide and questions to discuss (or present, if the class was small enough) in class.

        • Thanks Martha. I’m gonna think about this and see how well I can adjust things. I’ll let you know what does/doesn’t work this quarter.

        • PS I’ve still got most of my old handouts for statistics courses on the web; feel free to look at them and use anything that fits (There are probably some errors; if you let me know, I might or might not fix them)

          Analysis of Variance: http://www.ma.utexas.edu/users/mks/384E09/M384Esp09home.html

          Regression: http://www.ma.utexas.edu/users/mks/384Gfa08/384G08home.html

          Undergraduate applied statistics for math, etc. majors: http://www.ma.utexas.edu/users/mks/358Ksp06/358Ksp06home.html

  7. A title like “Street-fighting statistics” will hopefully attract many students from across the university. When I taught “Street-fighting mathematics” as a short MIT IAP course (IAP is the January term where faculty and students try out short, experimental courses or even single lectures), it had about 100 students, many due to the title!

    The most important feature of both books, at least as measured by how long it took me to figure out, is the organization around tools or modes of reasoning. Thus, the main topics are not the usual mathematics or science ones like mechanics, drag, differentiation, polynomials, heat, etc. Instead, the main topics are tools that can be used to understand and solve problems across many traditional topics.

    This feature is what I would preserve in “Street-fighting statistics.” So, instead of students’ going through the two books and then your adding statistics material in the last weeks, what about the following:

    1. From the two books, choose the tools most useful for fostering a street-fighting understanding of statistics (that is, knowing roughly how results should come out, without having to calculate everything).

    In the union of the two books, there are 12 tools: nine from _Art of Insight_, six from _SFM_, but three in common (the first three of _SFM_). Thus, you could do roughly one tool per week, maybe leaving out one or two tools as time requires.

    2. For each tool, use as much of the respective book’s exposition and problems as would suit your students, and then add statistics material using that tool (and previous tools). Some quick examples of possible material/problems:

    – dimensional analysis (_AoI_ Ch. 5 / _SFM_ Ch. 1): distinguishing probability from probability density (possible when the independent variable has dimensions).

    – easy cases (_AoI_ Ch. 8 / _SFM_ Ch. 2): (1) As you mentioned in the post, transitioning between random walk w/o drift and a regular walk. (2) Proving that P(A&B) != P(A)*P(B) using A=B as the easy case.

    – lumping (_AoI_ Ch. 6 / _SFM_ Ch. 3): estimating area of Gaussian tails (_SFM_ Problem 3.38 on p.55).

    – abstraction (_AoI_ Ch. 2): sufficient statistics

    – proportional reasoning (_AoI_ Ch. 4): birthday paradox explained by estimating no. of pairs (_AoI_, Sec. 4.4)

    – spring models (_AoI_ Ch. 9): the quadratic exponent in the normal distribution

    3. Order the tools (i.e. the units) so that the statistics material depends only on prior material, not on upcoming tools. (This step was difficult, I found.)
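To make one of the example problems from step 2 concrete, here is a rough Python sketch of the birthday paradox via pair counting. It is an illustration under stated assumptions, not the book's own treatment: birthdays are taken as uniform over 365 days, the function names are made up, and turning the expected number of matching pairs into a probability with 1 - exp(-expected) is a Poisson-style approximation added here.

```python
import math


def p_shared_exact(n, days=365):
    """Exact probability that at least two of n people share a birthday."""
    p_all_distinct = 1.0
    for i in range(n):
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


def p_shared_pairs(n, days=365):
    """Pair-counting estimate: n(n-1)/2 pairs, each matching with probability 1/days."""
    expected_matches = n * (n - 1) / 2 / days
    # Assumption: treat matches as roughly independent rare events (Poisson approximation).
    return 1.0 - math.exp(-expected_matches)


for n in (10, 23, 40):
    print(n, round(p_shared_exact(n), 3), round(p_shared_pairs(n), 3))
```

The point of the comparison is that the number of pairs grows like n squared, so the estimate crosses one half once n(n-1)/2 is comparable to the number of days, which happens for n in the low twenties.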

    Adding statistics examples to the examples already there should help students learn the tools, because they will get, in the jargon of learning theory, “variation of practice” (thus forcing them to focus on the common deep structure, which is the tool, rather than on accidental surface structure).
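As one more illustration of adding a statistics example to a tool, the dimensional-analysis item (probability versus probability density) can be checked numerically. This is a sketch with made-up numbers: the height distribution below is hypothetical, and the function names are for illustration only. A density carries units of 1/(units of x), so its numerical value changes when the units change, while a probability is dimensionless and does not.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal density at x; its units are 1 / (units of x)."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))


def normal_prob(a, b, mu, sigma):
    """P(a < X < b) for a normal variable; a dimensionless number."""
    def cdf(x):
        return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))
    return cdf(b) - cdf(a)


# Hypothetical height distribution, written once in meters and once in centimeters.
mu_m, sigma_m = 1.70, 0.10
mu_cm, sigma_cm = 170.0, 10.0

# Same point, different units: the density values differ by the conversion factor.
print(normal_pdf(1.75, mu_m, sigma_m), normal_pdf(175.0, mu_cm, sigma_cm))
# Same interval, different units: the probabilities agree.
print(normal_prob(1.70, 1.80, mu_m, sigma_m), normal_prob(170.0, 180.0, mu_cm, sigma_cm))
```

The densities differ by a factor of 100, the meters-to-centimeters conversion, which is the sign that a density is a per-unit-length quantity rather than a probability.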

    -Sanjoy
