This note on charter schools by Alex Tabarrok reminded me of my remarks on the relevant research paper by Dobbie and Fryer, remarks which I somehow never got around to posting here. So here are my (inconclusive) thoughts from a few months ago:
Harlem Children’s Zone (HCZ) is arguably the most ambitious social experiment to alleviate poverty of our time. We [Dobbie and Fryer] provide the first empirical test of the causal impact of HCZ on educational outcomes, with an eye toward informing the long-standing debate whether schools alone can eliminate the achievement gap or whether the issues that poor children bring to school are too much for educators to overcome.
Their conclusions are extremely positive:
Harlem Children’s Zone is enormously effective at increasing the achievement of the poorest minority children. Taken at face value, the effects in middle school are enough to reverse the black-white achievement gap in mathematics and reduce it in English Language Arts. The effects in elementary school close the racial achievement gap in both subjects. Harlem Gems and The Baby College, the only two community programs in HCZ that keep detailed administrative data, show mixed success. We conclude by presenting three pieces of evidence that high-quality schools or high-quality schools coupled with community investments generate the achievement gains. Community investments alone cannot explain the results.
Here’s how they address the potential concern that kids in the program will be better-prepared than the control group of kids not in these schools:
We implement two identification strategies. First, we exploit the fact that HCZ charter schools are required to select students by lottery when the demand for slots exceeds supply. Second, we use the interaction between a student’s home address and cohort year as an instrumental variable.
Here’s the punch line:
“Winners” here are students who receive a winning lottery number or who are in the top ten of the waitlist.
They also show results for English tests which are positive, but less impressive. They remark that, “Interventions in education often have larger impacts on math scores as compared to [English] scores (e.g. Decker et al., 2004; Rockoff, 2004; Jacob, 2005). This may be because it is relatively easier to teach math skills, or that reading skills are more likely to be learned outside of school. Another explanation is that language and vocabulary skills may develop early in life, making it difficult to impact reading scores later (Hart and Risley, 1995).”
What does this all mean?
I haven’t looked at the statistical details of this paper–that’s hard work!–but I do have a few comments, to be made on the assumption that Dobbie and Fryer’s analysis is essentially correct.
My first comment is that my mindset, before reading this paper, was that more effective teaching methods do exist–KIPP and the like–and that the way they work is by getting the teachers and students to work harder and longer than is usual during the school day. The Dobbie and Fryer paper did not change my view on this; they write “Our rough estimate is that Promise Academy students that are behind grade level are in school for twice as many hours as a typical public school student in New York City. Students who are at or above grade level still attend the equivalent of about fifty percent more school in a calendar year.”
This is not to dismiss the findings–it’s not so easy to motivate teachers and students to work twice as hard–but just to connect these results to other things that I’ve heard.
My second comment is that these schools are described as a way to close the gap between whites and blacks in school performance. But if they’re so effective, maybe they’d be applied to white kids also? Or is the point that these school changes would really only be applied as part of a package of interventions in predominantly minority neighborhoods? I’d like to hear more about this issue in the Conclusion section of the article, which raises the idea of following up in regular public schools.
Silly little things
Dobbie and Fryer’s paper has excellent graphs–something you don’t always see in work by economists. I’m happy to see that the top economists are presenting their work graphically–this seems like an excellent sign. I just have a couple of minor comments:
I’d prefer if Figure 1 (the map) were shown in a non-distorted way and with more information that is relevant to the study. For example, more information about exactly where the kids live, where the schools are, etc. The existing map is hard to read, partly because it is distorted (or so it looks to my eyes), meaning that the distance scale is not so meaningful, and partly because the orange background color makes it hard to see any details at all. Beyond this, the map includes irrelevant information such as the path of the Central Park road; this is the sort of thing that Ed Tufte correctly calls “chartjunk.” In this case, the authors didn’t add the chartjunk; they just put their info on an existing map. Nonetheless, the end result of this otherwise-potentially-useful map is to show nothing much more than that the Harlem Children’s Zone is, indeed, located in Harlem.
Figure 2 is just great. I have only a few small suggestions:
– Reduce the y-axis scale. There’s no reason to go all the way from -.6 to +.5; you can restrict to the range of the data, which is from -.4 to +.3. Even a small change like this will help a lot, actually.
– There’s something weird going on with the y-axis. You can’t put “percent enrolled” on the same scale as test scores! That’s like saying that my groceries cost $25 and it’s 15 degrees out, so my groceries are higher than the temperature. Also, you have to be careful with the whole “percentage” thing. Does “.2” on the percentage scale correspond to 0.2% or to 20%?
– Also, once you get rid of the percentage thing, you can really expand the scale, because the red and blue lines are all between -.4 and .02 on the y-axis.
– Beyond this, how do we interpret a test score of -.2? That doesn’t seem right. I assume that the actual scores are positive, and that this is all explained in the text, but I really think that graphs should be as self-contained as possible.
– The color scheme is great (once you can explain how percentages and test scores fit on a common scale). I’d recommend labeling the lines directly rather than using a legend. Once you fix the scale, the lines will be farther apart also.
– 2003 should come before 2004. In the graph shown, 2004 is on the left and 2003 is on the right, which is counter to the conventional way of displaying time ordering.
I won’t go over the other graphs line by line, except to say that they’re basically fine. I would prefer, however, that they use a consistent color scheme throughout. In Figure 2, blue represents Math score and red represents English score; in the other figures, blue means Lottery Winners and red means Lottery Losers.
And then there are the tables. I think you know already what I’m going to say, so I won’t bother to say it. (I mean, 10.424 with a standard error of 7.167? What are these people thinking?) I know, I know, default choices don’t need to be justified. But, still . . .
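For what it’s worth, the complaint can be made concrete. Assuming the issue is excess precision (digits far below the size of the standard error carry no information), here is a minimal Python sketch that rounds an estimate to the precision its standard error supports. The function name `round_to_se` and the default of one significant digit for the standard error are my own choices for illustration, not anything from the paper.

```python
import math


def round_to_se(estimate, se, se_sig_digits=1):
    """Round a standard error to `se_sig_digits` significant digits and
    round the estimate to the same decimal place.

    Hypothetical helper for illustration only; the rounding rule is an
    assumption, not a convention taken from Dobbie and Fryer's paper.
    """
    if se <= 0:
        raise ValueError("standard error must be positive")
    # Position of the leading digit of the standard error:
    # 7.167 -> 0, 0.0123 -> -2, 123.4 -> 2
    exponent = math.floor(math.log10(se))
    # Number of decimal places to keep (may be negative for large se)
    decimals = se_sig_digits - 1 - exponent
    return round(estimate, decimals), round(se, decimals)


# The table entry quoted above: 10.424 with a standard error of 7.167
print(round_to_se(10.424, 7.167))                   # (10.0, 7.0)
print(round_to_se(10.424, 7.167, se_sig_digits=2))  # (10.4, 7.2)
```

Under this rule the entry reads as roughly 10 with a standard error of 7, which conveys everything the five extra digits did.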
It’s worth emphasizing, at this point, that I think the authors present their results very well, both graphically and in the text of their article. It’s only because they took the leap to make these solid graphs, that I can take the next step and try to help them do even better next time. I think one of the roles of a statistician such as myself is to help researchers do their jobs even better–and this is particularly satisfying in settings such as this, where there’s no way I would’ve been doing the research myself.
The last line of the acknowledgments says, “The usual caveat applies.” I have no idea what that means–something in economics-speak? I have noticed in general that econ papers have longer acknowledgment sections than stat papers do. My theory has always been that economists write fewer articles and put more time into each one, whereas statisticians spit out articles at a machine-gun rate and don’t look back. The two fields have different systems: my impression is that in econ, it’s a big deal to be published in the American Economic Review or wherever, whereas, in stat, an article in JASA or Annals of Statistics or wherever won’t necessarily get noticed anyway.