Here’s a cute one for your intro probability class. Karen Langley from the Wall Street Journal asks: What is the probability of the Dow Jones Industrial Average closing unchanged from the day before, as it did yesterday? To answer this question we need to know two things: 1. How much does the Dow Jones average […]

**Teaching** category.

## To do: Construct a build-your-own-relevant-statistics-class kit.

Alexis Lerner, who took a couple of our courses on applied regression and communicating data and statistics, designed a new course, “Jews: By the Numbers,” at the University of Toronto: But what does it mean to work with data and statistics in a Jewish studies course? For Lerner, it means not only teaching her students […]

## How to teach sensible elementary statistics to lower-division undergraduates?

Kevin Carlson writes: Though my graduate education is in mathematics, I teach elementary statistics to lower-division undergraduates. The traditional elementary statistics curriculum culminates in confidence intervals and hypothesis tests. Most students can learn to perform these tests, but few understand them. It seems to me that there’s a great opportunity to reform the elementary curriculum […]

## He’s looking for a Bayesian book

Michael Lewis wrote: I’m teaching a course on Bayesian statistics this fall. I’d love to use your book but think it might be too difficult for the, mainly, graduate social work, sociology, and psychology students likely to enroll. What do you think? In response, I pointed to these two books that are more accessible than […]

## The virtue of fake universes: A purposeful and safe way to explain empirical inference.

This post is by Keith O’Rourke and as with all posts and comments on this blog, is just a deliberation on dealing with uncertainties in scientific inquiry and should not be attributed to any entity other than the author. As with any critically-thinking inquirer, the views behind these deliberations are always subject to rethinking […]

## Golf example now a Stan case study!

It’s here! (and here’s the page with all the Stan case studies). In this case study, I’m following up on two earlier posts, here and here, which in turn follow up this 2002 paper with Deb Nolan. My Stan case study is an adaptation of a model fit by Columbia business school professor and golf […]

## Jeff Leek: “Data science education as an economic and public health intervention – how statisticians can lead change in the world”

Jeff Leek from Johns Hopkins University is speaking in our statistics department seminar next week: “Data science education as an economic and public health intervention – how statisticians can lead change in the world.” Time: 4:10pm Monday, October 7. Location: 903 School of Social Work. Abstract: The data science revolution has led to massive new […]

## Columbia statistics department hiring teachers and researchers

Details here. Here are the four positions: 1. The Department of Statistics invites applications for a tenure-track Assistant Professor position to begin July 1, 2020. A Ph.D. in statistics or a related field is required. Candidates will be expected to sustain an active research and publication agenda and to teach in the departmental undergraduate and […]

## Here’s why you need to bring a rubber band to every class you teach, every time.

A student discussion leader in every class period: Recently we’ve been having a student play the role of discussion leader in class. That is, each class period we get a student to volunteer to lead the discussion next time. This student takes special effort to be prepared, and I’ve seen three positive results: – At […]

## How to read (in quantitative social science). And by implication, how to write.

I’m reposting this one from 2014 because I think it could be useful to lots of people. Also this advice on writing research articles, from 2009.

## What if that regression-discontinuity paper had only reported local linear model results, and with no graph?

We had an interesting discussion the other day regarding a regression discontinuity disaster. In my post I shone a light on this fitted model: Most of the commenters seemed to understand the concern with these graphs, that the upward slopes in the curves directly contribute to the estimated negative value at the discontinuity leading to […]

## Causal inference: I recommend the classical approach in which an observational study is understood in reference to a hypothetical controlled experiment

Amy Cohen asked me what I thought of this article, “Control of Confounding and Reporting of Results in Causal Inference Studies: Guidance for Authors from Editors of Respiratory, Sleep, and Critical Care Journals,” by David Lederer et al. I replied that I liked some of their recommendations (downplaying p-values, graphing raw data, presenting results clearly) […]

## We’re done with our Applied Regression final exam (and solution to question 15)

We’re done with our exam. And the solution to question 15: 15. Consider the following procedure. • Set n = 100 and draw n continuous values x_i uniformly distributed between 0 and 10. Then simulate data from the model y_i = a + bx_i + error_i, for i = 1,…,n, with a = 2, b […]

## Question 15 of our Applied Regression final exam (and solution to question 14)

Here’s question 15 of our exam: 15. Consider the following procedure. • Set n = 100 and draw n continuous values x_i uniformly distributed between 0 and 10. Then simulate data from the model y_i = a + bx_i + error_i, for i = 1,…,n, with a = 2, b = 3, and independent errors […]

## Question 14 of our Applied Regression final exam (and solution to question 13)

Here’s question 14 of our exam: 14. You are predicting whether a student passes a class given pre-test score. The fitted model is, Pr(Pass) = logit^−1(a_j + 0.1x), for a student in classroom j whose pre-test score is x. The pre-test scores range from 0 to 50. The a_j’s are estimated to have a normal […]

## Question 13 of our Applied Regression final exam (and solution to question 12)

Here’s question 13 of our exam: 13. You fit a model of the form: y ∼ x + u_full + (1 | group). The estimated coefficients are 2.5, 0.7, and 0.5 respectively for the intercept, x, and u_full, with group and individual residual standard deviations estimated as 2.0 and 3.0 respectively. Write the […]

## Question 12 of our Applied Regression final exam (and solution to question 11)

Here’s question 12 of our exam: 12. In the regression above, suppose you replaced height in inches by height in centimeters. What would then be the intercept and slope of the regression? (One inch is 2.54 centimeters.) And the solution to question 11: 11. We defined a new variable based on weight (in pounds): heavy […]
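The unit change in question 12 can be checked numerically. The following is a minimal sketch, not the exam's regression: the height and weight values are made up for illustration, and `fit_line` is a hypothetical helper that does ordinary least squares by hand. It shows the general fact that multiplying a predictor by 2.54 divides its slope by 2.54 and leaves the intercept unchanged.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

# Made-up illustrative data (assumption), not the exam's data set.
height_in = [60.0, 63.0, 66.0, 69.0, 72.0, 75.0]
weight_lb = [115.0, 140.0, 150.0, 172.0, 180.0, 210.0]

# Same heights expressed in centimeters (one inch is 2.54 centimeters).
height_cm = [h * 2.54 for h in height_in]

a_in, b_in = fit_line(height_in, weight_lb)
a_cm, b_cm = fit_line(height_cm, weight_lb)

print("inches:      intercept %.2f, slope %.3f" % (a_in, b_in))
print("centimeters: intercept %.2f, slope %.3f" % (a_cm, b_cm))
print("slope ratio (inches / centimeters): %.2f" % (b_in / b_cm))
```

The intercept is the prediction at height zero, which is the same point in either unit, so only the slope changes.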

## Question 11 of our Applied Regression final exam (and solution to question 10)

Here’s question 11 of our exam: 11. We defined a new variable based on weight (in pounds): heavy <- weight > 200 and then ran a logistic regression, predicting “heavy” from height (in inches): glm(formula = heavy ~ height, family = binomial(link = "logit")) coef.est coef.se (Intercept) -21.51 1.60 height 0.28 0.02 --- n = 1984, k = […]
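To get a feel for the fitted logistic regression above, it can help to turn the two coefficients into predicted probabilities. This is a rough sketch using only the estimates quoted in the excerpt (-21.51 and 0.28); the function names and the heights plugged in are illustrative choices, not part of the exam.

```python
import math

def inv_logit(z):
    """Inverse logit: maps a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

# Coefficient estimates quoted in the excerpt above.
INTERCEPT = -21.51
SLOPE = 0.28  # per inch of height

def prob_heavy(height_in):
    """Fitted Pr(heavy) at a given height in inches."""
    return inv_logit(INTERCEPT + SLOPE * height_in)

# Hypothetical heights chosen only for illustration.
for h in (60, 70, 80):
    print("height %d in: Pr(heavy) = %.3f" % (h, prob_heavy(h)))

# Height at which the fitted probability crosses 0.5.
midpoint = -INTERCEPT / SLOPE
print("Pr(heavy) = 0.5 at about %.1f inches" % midpoint)
```

Near the midpoint the curve is steepest, where each additional inch changes the probability by roughly 0.28/4 = 0.07.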

## Question 10 of our Applied Regression final exam (and solution to question 9)

Here’s question 10 of our exam: 10. For the above example, we then created indicator variables, age18_29, age30_44, age45_64, and age65up, for four age categories. We then fit a new regression: lm(formula = weight ~ age30_44 + age45_64 + age65up) coef.est coef.se (Intercept) 157.2 5.4 age30_44TRUE 19.1 7.0 age45_64TRUE 27.2 7.6 age65upTRUE 8.5 8.7 n […]
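Because this regression has only indicator variables as predictors, the intercept is the fitted average weight for the omitted category (age18_29), and each coefficient is a difference from that baseline. A small sketch of that arithmetic, using the estimates quoted above (the variable names are just for illustration):

```python
# Estimates quoted in the excerpt above.
baseline = 157.2  # intercept: fitted mean weight for the omitted group, age18_29
offsets = {"age30_44": 19.1, "age45_64": 27.2, "age65up": 8.5}

# Fitted mean for each group is the baseline plus that group's coefficient.
fitted = {"age18_29": baseline}
for name, coef in offsets.items():
    fitted[name] = round(baseline + coef, 1)

for name, mean_weight in fitted.items():
    print("%-9s fitted mean weight: %.1f pounds" % (name, mean_weight))
```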

## Question 9 of our Applied Regression final exam (and solution to question 8)

Here’s question 9 of our exam: 9. We downloaded data with weight (in pounds) and age (in years) from a random sample of American adults. We created a new variable, age10 = age/10. We then fit a regression: lm(formula = weight ~ age10) coef.est coef.se (Intercept) 161.0 7.3 age10 2.6 1.6 n = 2009, k […]
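Since the predictor here is age divided by 10, the slope is in pounds per decade, not per year. The sketch below just reads off the quoted estimates; the ages plugged in and the rough two-standard-error interval are illustrative and are not part of the exam question.

```python
# Estimates quoted in the excerpt above.
INTERCEPT = 161.0
SLOPE_PER_DECADE = 2.6  # coefficient on age10 = age/10
SE_PER_DECADE = 1.6     # its standard error

def predicted_weight(age_years):
    """Fitted weight in pounds at a given age in years."""
    return INTERCEPT + SLOPE_PER_DECADE * (age_years / 10)

# The same slope expressed per year of age.
slope_per_year = SLOPE_PER_DECADE / 10

# Rough interval: estimate plus or minus two standard errors.
low = SLOPE_PER_DECADE - 2 * SE_PER_DECADE
high = SLOPE_PER_DECADE + 2 * SE_PER_DECADE

for age in (20, 40, 60):  # hypothetical ages for illustration
    print("age %d: predicted weight %.1f pounds" % (age, predicted_weight(age)))
print("slope per year: %.2f pounds" % slope_per_year)
print("rough interval for slope per decade: [%.1f, %.1f]" % (low, high))
```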