I keep being drawn to thinking there is a way to explain statistical reasoning to others that will actually do more good than harm. Now, I also keep thinking I should know better – but can’t stop. My recent attempt starts with a shadow metaphor, then a review of analytical chemistry and moves to the […]

**Teaching** category.

## Golf example now a Stan case study!

It’s here! (and here’s the page with all the Stan case studies). In this case study, I’m following up on two earlier posts, here and here, which in turn follow up this 2002 paper with Deb Nolan. My Stan case study is an adaptation of a model fit by Columbia business school professor and golf […]

## Jeff Leek: “Data science education as an economic and public health intervention – how statisticians can lead change in the world”

Jeff Leek from Johns Hopkins University is speaking in our statistics department seminar next week: “Data science education as an economic and public health intervention – how statisticians can lead change in the world.” Time: 4:10pm Monday, October 7. Location: 903 School of Social Work. Abstract: The data science revolution has led to massive new […]

## Columbia statistics department hiring teachers and researchers

Details here. Here are the four positions: 1. The Department of Statistics invites applications for a tenure-track Assistant Professor position to begin July 1, 2020. A Ph.D. in statistics or a related field is required. Candidates will be expected to sustain an active research and publication agenda and to teach in the departmental undergraduate and […]

## Here’s why you need to bring a rubber band to every class you teach, every time.

A student discussion leader in every class period. Recently we’ve been having a student play the role of discussion leader in class. That is, each class period we get a student to volunteer to lead the discussion next time. This student makes a special effort to be prepared, and I’ve seen three positive results: – At […]

## How to read (in quantitative social science). And by implication, how to write.

I’m reposting this one from 2014 because I think it could be useful to lots of people. Also, see this advice on writing research articles, from 2009.

## What if that regression-discontinuity paper had only reported local linear model results, and with no graph?

We had an interesting discussion the other day regarding a regression discontinuity disaster. In my post I shone a light on this fitted model: Most of the commenters seemed to understand the concern with these graphs, that the upward slopes in the curves directly contribute to the estimated negative value at the discontinuity leading to […]

## Causal inference: I recommend the classical approach in which an observational study is understood in reference to a hypothetical controlled experiment

Amy Cohen asked me what I thought of this article, “Control of Confounding and Reporting of Results in Causal Inference Studies: Guidance for Authors from Editors of Respiratory, Sleep, and Critical Care Journals,” by David Lederer et al. I replied that I liked some of their recommendations (downplaying p-values, graphing raw data, presenting results clearly) […]

## We’re done with our Applied Regression final exam (and solution to question 15)

We’re done with our exam. And the solution to question 15: 15. Consider the following procedure. • Set n = 100 and draw n continuous values x_i uniformly distributed between 0 and 10. Then simulate data from the model y_i = a + bx_i + error_i, for i = 1,…,n, with a = 2, b […]

## Question 15 of our Applied Regression final exam (and solution to question 14)

Here’s question 15 of our exam: 15. Consider the following procedure. • Set n = 100 and draw n continuous values x_i uniformly distributed between 0 and 10. Then simulate data from the model y_i = a + bx_i + error_i, for i = 1,…,n, with a = 2, b = 3, and independent errors […]
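
The simulation setup in this question can be sketched in a few lines. This is only an illustration, not the exam's solution: the excerpt cuts off before giving the error distribution, so the error standard deviation `sigma` below is a hypothetical placeholder, and the helper name `simulate_and_fit` is made up for this sketch.

```python
import random

def simulate_and_fit(n=100, a=2.0, b=3.0, sigma=5.0, seed=1):
    """Simulate y = a + b*x + error, then return least-squares (intercept, slope).

    sigma is a hypothetical value: the excerpt is truncated before the
    error distribution is specified.
    """
    rng = random.Random(seed)
    x = [rng.uniform(0, 10) for _ in range(n)]           # x_i uniform on [0, 10]
    y = [a + b * xi + rng.gauss(0, sigma) for xi in x]   # independent normal errors

    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope

a_hat, b_hat = simulate_and_fit()
print(f"estimated intercept {a_hat:.2f}, estimated slope {b_hat:.2f}")
```

Rerunning with different seeds gives a feel for how much the estimates move around the true values a = 2 and b = 3.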

## Question 14 of our Applied Regression final exam (and solution to question 13)

Here’s question 14 of our exam: 14. You are predicting whether a student passes a class given pre-test score. The fitted model is, Pr(Pass) = logit^−1(a_j + 0.1x), for a student in classroom j whose pre-test score is x. The pre-test scores range from 0 to 50. The a_j’s are estimated to have a normal […]

## Question 13 of our Applied Regression final exam (and solution to question 12)

Here’s question 13 of our exam: 13. You fit a model of the form: y ∼ x + u full + (1 | group). The estimated coefficients are 2.5, 0.7, and 0.5 respectively for the intercept, x, and u full, with group and individual residual standard deviations estimated as 2.0 and 3.0 respectively. Write the […]

## Question 12 of our Applied Regression final exam (and solution to question 11)

Here’s question 12 of our exam: 12. In the regression above, suppose you replaced height in inches by height in centimeters. What would then be the intercept and slope of the regression? (One inch is 2.54 centimeters.) And the solution to question 11: 11. We defined a new variable based on weight (in pounds): heavy […]
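
For the unit change in question 12, a small check can make the logic concrete. The regression itself is not shown in this excerpt, so the coefficients below are hypothetical placeholders. The point is only that multiplying a predictor by 2.54 divides its slope by 2.54 and leaves the intercept alone, so predictions are unchanged.

```python
CM_PER_INCH = 2.54

def rescale_coefs(intercept, slope, factor):
    """Coefficients after the predictor is multiplied by `factor`.

    If x_new = factor * x, then intercept + slope * x
    equals intercept + (slope / factor) * x_new.
    """
    return intercept, slope / factor

# Hypothetical coefficients for a regression on height in inches.
intercept_in, slope_in = -100.0, 4.0
intercept_cm, slope_cm = rescale_coefs(intercept_in, slope_in, CM_PER_INCH)

height_in = 66.0
pred_in = intercept_in + slope_in * height_in
pred_cm = intercept_cm + slope_cm * (height_in * CM_PER_INCH)
print(f"prediction using inches: {pred_in:.2f}, using centimeters: {pred_cm:.2f}")
```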

## Question 11 of our Applied Regression final exam (and solution to question 10)

Here’s question 11 of our exam: 11. We defined a new variable based on weight (in pounds): heavy <- weight > 200 and then ran a logistic regression, predicting “heavy” from height (in inches): glm(formula = heavy ~ height, family = binomial(link = "logit")) coef.est coef.se (Intercept) -21.51 1.60 height 0.28 0.02 --- n = 1984, k = […]
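
One way to read the fitted logistic regression quoted in this question is to turn the coefficients into predicted probabilities. The sketch below uses the two estimates from the output (-21.51 and 0.28). The function names and the heights chosen are assumptions for illustration, and this is not presented as the exam's solution.

```python
import math

INTERCEPT = -21.51   # from the quoted output
SLOPE = 0.28         # coefficient on height (inches), from the quoted output

def invlogit(z):
    """Inverse logit: maps the linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def pr_heavy(height_in):
    """Predicted probability of 'heavy' at a given height in inches."""
    return invlogit(INTERCEPT + SLOPE * height_in)

for h in (60, 66, 72, 78):
    print(f"height {h} in: Pr(heavy) = {pr_heavy(h):.3f}")

# Height at which the predicted probability is 0.5 (linear predictor = 0).
midpoint = -INTERCEPT / SLOPE
# Steepest slope of the curve, on the probability scale, is SLOPE / 4.
max_change_per_inch = SLOPE / 4
print(f"50% point at {midpoint:.1f} in; at most {max_change_per_inch:.2f} per inch")
```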

## Question 10 of our Applied Regression final exam (and solution to question 9)

Here’s question 10 of our exam: 10. For the above example, we then created indicator variables, age18_29, age30_44, age45_64, and age65up, for four age categories. We then fit a new regression: lm(formula = weight ~ age30_44 + age45_64 + age65up) coef.est coef.se (Intercept) 157.2 5.4 age30_44TRUE 19.1 7.0 age45_64TRUE 27.2 7.6 age65upTRUE 8.5 8.7 n […]
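
With indicator variables and an omitted baseline category, the intercept is the estimated average for the baseline group (here ages 18–29) and each coefficient is a difference from that baseline. A minimal sketch using the estimates quoted above (the dictionary keys are just labels chosen for this illustration):

```python
BASELINE_MEAN = 157.2   # intercept: estimated mean weight for ages 18-29

# Estimated differences from the 18-29 baseline, from the quoted output.
DIFFERENCES = {
    "30-44": 19.1,
    "45-64": 27.2,
    "65+": 8.5,
}

def group_means(baseline, differences):
    """Estimated mean for each group: baseline plus that group's coefficient."""
    means = {"18-29": baseline}
    for group, diff in differences.items():
        means[group] = baseline + diff
    return means

means = group_means(BASELINE_MEAN, DIFFERENCES)
for group, mean in means.items():
    print(f"ages {group}: estimated mean weight {mean:.1f} lb")
```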

## Question 9 of our Applied Regression final exam (and solution to question 8)

Here’s question 9 of our exam: 9. We downloaded data with weight (in pounds) and age (in years) from a random sample of American adults. We created a new variable, age10 = age/10. We then fit a regression: lm(formula = weight ~ age10) coef.est coef.se (Intercept) 161.0 7.3 age10 2.6 1.6 n = 2009, k […]
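
To see what the fitted line implies, here is a rough sketch that plugs the quoted estimates into the model. Because the predictor is age/10, the slope of 2.6 is pounds per decade of age. The interval uses the common "estimate ± 2 standard errors" approximation, which is an assumption of this sketch and not something the excerpt states.

```python
INTERCEPT = 161.0          # from the quoted output
SLOPE_PER_DECADE = 2.6     # coefficient on age10 = age / 10
SLOPE_SE = 1.6             # standard error of that coefficient

def predicted_weight(age_years):
    """Predicted weight in pounds at a given age in years."""
    return INTERCEPT + SLOPE_PER_DECADE * (age_years / 10)

def rough_interval(estimate, se, multiplier=2.0):
    """Approximate 95% interval: estimate +/- multiplier * standard error."""
    return estimate - multiplier * se, estimate + multiplier * se

low, high = rough_interval(SLOPE_PER_DECADE, SLOPE_SE)
print(f"predicted weight at age 50: {predicted_weight(50):.1f} lb")
print(f"slope per decade: {SLOPE_PER_DECADE} lb, rough interval [{low:.1f}, {high:.1f}]")
```

The rough interval includes zero, so these data alone do not pin down the sign of a linear age trend.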

## Question 8 of our Applied Regression final exam (and solution to question 7)

Here’s question 8 of our exam: 8. Out of a random sample of 50 Americans, zero report having ever held political office. From this information, give a 95% confidence interval for the proportion of Americans who have ever held political office. And the solution to question 7: 7. You conduct an experiment in which some […]
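
Question 8 is a case where the usual formula breaks down: with zero successes, the estimate is 0 and the standard error sqrt(p(1 − p)/n) is also 0. Two common workarounds are sketched below. Neither is claimed to be the exam's official solution, and the function names are made up for this illustration.

```python
import math

def exact_upper_bound(n, alpha=0.05):
    """One-sided upper bound when 0 of n are successes.

    Solves (1 - p)^n = alpha for p: the largest p under which seeing
    zero successes still has probability alpha.
    """
    return 1.0 - alpha ** (1.0 / n)

def plus_four_interval(y, n, z=1.96):
    """Interval after adding 2 successes and 2 failures (Agresti-Coull style).

    The lower end is clipped at 0 because a proportion cannot be negative.
    """
    p_tilde = (y + 2) / (n + 4)
    se = math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return max(0.0, p_tilde - z * se), min(1.0, p_tilde + z * se)

n = 50
print(f"exact one-sided 95% upper bound: {exact_upper_bound(n):.3f}")
print(f"rule-of-three approximation 3/n: {3 / n:.3f}")
low, high = plus_four_interval(0, n)
print(f"plus-four interval: [{low:.3f}, {high:.3f}]")
```

The methods disagree somewhat, but all of them say the same thing in practice: the proportion is small, plausibly anywhere from zero to several percent.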

## Question 7 of our Applied Regression final exam (and solution to question 6)

Here’s question 7 of our exam: 7. You conduct an experiment in which some people get a special get-out-the-vote message and others do not. Then you follow up with a sample, after the election, to see if they voted. If you follow up with 500 people, how large an effect would you be able to […]
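
A back-of-the-envelope design calculation for an experiment like this can be sketched as follows. The excerpt is cut off before the question's exact conditions, so everything here is an assumption made for illustration: 500 people split evenly between treatment and control, turnout near 50% (the worst case for the standard error), a two-sided 5% test, and 80% power.

```python
import math
from statistics import NormalDist

def detectable_effect(n_total, p=0.5, alpha=0.05, power=0.80):
    """Smallest difference in proportions detectable with the given power.

    Assumes two equal groups of size n_total / 2 and a common underlying
    proportion p when computing the standard error of the difference.
    """
    n_group = n_total / 2
    se_diff = math.sqrt(p * (1 - p) / n_group + p * (1 - p) / n_group)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # about 1.96
    z_power = NormalDist().inv_cdf(power)           # about 0.84
    return (z_alpha + z_power) * se_diff

effect = detectable_effect(500)
print(f"detectable effect: about {100 * effect:.1f} percentage points")
```

The multiplier z_alpha + z_power is roughly 2.8, which is why "2.8 standard errors" is a common rule of thumb for 80% power.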

## Question 6 of our Applied Regression final exam (and solution to question 5)

Here’s question 6 of our exam: 6. You are applying hierarchical logistic regression on a survey of 1500 people to estimate support for a federal jobs program. The model is fit using, as a state-level predictor, the Republican presidential vote in the state. Which of the following two statements is basically true? (a) Adding a […]

## Question 5 of our Applied Regression final exam (and solution to question 4)

Here’s question 5 of our exam: 5. You have just graded an exam with 28 questions and 15 students. You fit a logistic item-response model estimating ability, difficulty, and discrimination parameters. Which of the following statements are basically true? (a) If a question is answered correctly by students with low ability, but is missed by […]