Are High-Quality Schools Enough to Close the Achievement Gap? Evidence from a Bold Social Experiment in Harlem

This note on charter schools by Alex Tabarrok reminded me of my remarks on the relevant research paper by Dobbie and Fryer, remarks which I somehow never got around to posting here. So here are my (inconclusive) thoughts from a few months ago:

Steve Levitt links to this article by Will Dobbie and Roland Fryer on an educational innovation to improve the education of ethnic minority children. Dobbie and Fryer write:

Harlem Children’s Zone (HCZ) is arguably the most ambitious social experiment to alleviate poverty of our time. We [Dobbie and Fryer] provide the first empirical test of the causal impact of HCZ on educational outcomes, with an eye toward informing the long-standing debate whether schools alone can eliminate the achievement gap or whether the issues that poor children bring to school are too much for educators to overcome.

Their conclusions are extremely positive:

Harlem Children’s Zone is enormously effective at increasing the achievement of the poorest minority children. Taken at face value, the effects in middle school are enough to reverse the black-white achievement gap in mathematics and reduce it in English Language Arts. The effects in elementary school close the racial achievement gap in both subjects. Harlem Gems and The Baby College, the only two community programs in HCZ that keep detailed administrative data, show mixed success. We conclude by presenting three pieces of evidence that high-quality schools or high-quality schools coupled with community investments generate the achievement gains. Community investments alone cannot explain the results.

Here’s how they address the potential concern that kids in the program will be better-prepared than the control group of kids not in these schools:

We implement two identification strategies. First, we exploit the fact that HCZ charter schools are required to select students by lottery when the demand for slots exceeds supply. Second, we use the interaction between a student’s home address and cohort year as an instrumental variable.
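
To make the lottery strategy concrete, here is a minimal sketch of the basic idea, not the authors' actual specification (which this post does not reproduce). It compares lottery winners with losers (the intent-to-treat effect) and then rescales by the difference in enrollment rates (the Wald estimator). The function names and the toy numbers are made up for illustration.

```python
from statistics import mean

def itt_effect(outcomes, won):
    """Intent-to-treat effect: mean outcome of lottery winners minus losers."""
    winners = [y for y, w in zip(outcomes, won) if w]
    losers = [y for y, w in zip(outcomes, won) if not w]
    return mean(winners) - mean(losers)

def wald_estimate(scores, enrolled, won):
    """Effect of enrolling: ITT on scores divided by ITT on enrollment."""
    first_stage = itt_effect([1.0 if e else 0.0 for e in enrolled], won)
    if first_stage == 0:
        raise ValueError("winning the lottery does not change enrollment")
    return itt_effect(scores, won) / first_stage

# Hypothetical toy data (NOT from the paper): 4 winners, 4 losers.
won      = [True, True, True, True, False, False, False, False]
enrolled = [True, True, True, False, False, False, False, True]
scores   = [0.4, 0.2, 0.3, -0.1, -0.2, -0.3, -0.1, 0.2]

itt = itt_effect(scores, won)
late = wald_estimate(scores, enrolled, won)
print(f"ITT: {itt:.2f}, effect of enrolling: {late:.2f}")
```

Because the lottery is random, winners and losers should be comparable on average, which is what lets the simple difference in means stand in for a causal effect.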

Here’s the punch line:


“Winners” here are students who receive a winning lottery number or who are in the top ten of the waitlist.

They also show results for English tests which are positive, but less impressive. They remark that, “Interventions in education often have larger impacts on math scores as compared to [English] scores (e.g. Decker et al., 2004; Rockoff, 2004; Jacob, 2005). This may be because it is relatively easier to teach math skills, or that reading skills are more likely to be learned outside of school. Another explanation is that language and vocabulary skills may develop early in life, making it difficult to impact reading scores later (Hart and Risley, 1995).”

What does this all mean?

I haven’t looked at the statistical details of this paper–that’s hard work!–but I do have a few comments, to be made on the assumption that Dobbie and Fryer’s analysis is essentially correct.

My first comment is that my mindset, before reading this paper, was that more effective teaching methods do exist–KIPP and the like–and that the way they work is by getting the teachers and students to work harder and longer than is usual during the school day. The Dobbie and Fryer paper did not change my view on this; they write, “Our rough estimate is that Promise Academy students that are behind grade level are in school for twice as many hours as a typical public school student in New York City. Students who are at or above grade level still attend the equivalent of about fifty percent more school in a calendar year.”

This is not to dismiss the findings–it’s not so easy to motivate teachers and students to work twice as hard–but just to connect these results to other things that I’ve heard.

My second comment is that these schools are described as a way to close the gap between whites and blacks in school performance. But if they’re so effective, maybe they’d be applied to white kids also? Or is the point that these school changes would really only be applied as part of a package of interventions in predominantly-minority neighborhoods? I’d like to hear more about this issue in the Conclusion section of the article, which raises the idea of following up in regular public schools.

Silly little things

Dobbie and Fryer’s paper has excellent graphs–something you don’t always see in work by economists. I’m happy to see that the top economists are presenting their work graphically–this seems like an excellent sign. I just have a couple of minor comments:

I’d prefer if Figure 1 (the map) were shown in a non-distorted way and with more information that is relevant to the study. For example, more information about exactly where the kids live, where the schools are, etc. The existing map is hard to read partly because it is distorted (or so it looks to my eyes), meaning that the distance scale is not so meaningful. Also, the orange background color makes it hard to see any details at all. Beyond this, the map includes irrelevant information such as the path of the Central Park road; this is the sort of thing that Ed Tufte correctly calls “chartjunk.” In this case, the authors didn’t add the chartjunk; they just put their info on an existing map. Nonetheless, the end result of this otherwise-potentially-useful map is to show nothing much more than that the Harlem Children’s Zone is, indeed, located in Harlem.

Figure 2 is just great. I have only a few small suggestions:
– Reduce the y-axis scale. There’s no reason to go all the way from -.6 to +.5; you can restrict to the range of the data, which is from -.4 to +.3. Even a small change like this will help a lot, actually.
– There’s something weird going on with the y-axis. You can’t put “percent enrolled” on the same scale as test scores! That’s like saying that my groceries cost $25 and it’s 15 degrees out, so my groceries are higher than the temperature. Also, you have to be careful with the whole “percentage” thing. Does “.2” on the percentage scale correspond to 0.2% or to 20%?
– Also, once you get rid of the percentage thing, you can really expand the scale, because the red and blue lines are all between -.4 and .02 on the y-axis.
– Beyond this, how do we interpret a test score of -.2? That doesn’t seem right. I assume that the actual scores are positive, and that this is all explained in the text, but I really think that graphs should be as self-contained as possible.
– The color scheme is great (once you can explain how percentages and test scores fit on a common scale). I’d recommend labeling the lines directly rather than using a legend. Once you fix the scale, the lines will be farther apart also.
– 2003 should come before 2004. In the graph shown, 2004 is on the left and 2003 is on the right, which is counter to the conventional way of displaying time ordering.
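
On the question of what a test score of -.2 could mean: one plausible reading (an assumption, not something the figure itself states) is that scores have been standardized against a reference population, so that 0 is the reference average and -.2 is two-tenths of a standard deviation below it. A minimal sketch, with hypothetical scale scores and a made-up reference group:

```python
from statistics import mean, pstdev

def standardize(scores, reference):
    """Express scores in standard-deviation units of a reference population."""
    mu = mean(reference)
    sigma = pstdev(reference)
    if sigma == 0:
        raise ValueError("reference scores have no spread")
    return [(x - mu) / sigma for x in scores]

# Hypothetical raw scale scores (all positive), NOT taken from the paper.
reference = [620, 650, 680, 710, 640]   # stand-in for a citywide sample
school = [650, 655, 670]                # stand-in for one school's students

z_reference = standardize(reference, reference)
z_school = standardize(school, reference)
print([round(z, 2) for z in z_school])
```

Under this reading, raw scores are always positive but standardized scores can be negative, which is why a label saying so on the graph itself would help.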

I won’t go over the other graphs line by line, except to say that they’re basically fine. I would prefer, however, that they use a consistent color scheme throughout. In Figure 2, blue represents Math score and red represents English score; in the other figures, blue means Lottery Winners and red means Lottery Losers.

And then there are the tables. I think you know already what I’m going to say, so I won’t bother to say it. (I mean, 10.424 with a standard error of 7.167? What are these people thinking?) I know, I know, default choices don’t need to be justified. But, still . . .
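
For readers who don't already know the complaint: presumably it is about reporting more digits than the standard error can support. One possible convention (an assumption here, not a rule stated in this post or in the paper) is to round both numbers to the leading digit or two of the standard error. The helper name `round_to_se` is made up for this sketch.

```python
from math import floor, log10

def round_to_se(estimate, se, sig_digits=1):
    """Round an estimate and its standard error to the precision
    implied by the first `sig_digits` significant digits of the SE."""
    if se <= 0:
        raise ValueError("standard error must be positive")
    # Number of decimal places that keeps `sig_digits` digits of the SE.
    decimals = sig_digits - 1 - floor(log10(se))
    return round(estimate, decimals), round(se, decimals)

print(round_to_se(10.424, 7.167))      # one significant digit of the SE
print(round_to_se(10.424, 7.167, 2))   # two significant digits of the SE
```

With a standard error of about 7, the digits after the decimal point in 10.424 carry essentially no information.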

It’s worth emphasizing, at this point, that I think the authors present their results very well, both graphically and in the text of their article. It’s only because they took the leap to make these solid graphs, that I can take the next step and try to help them do even better next time. I think one of the roles of a statistician such as myself is to help researchers do their jobs even better–and this is particularly satisfying in settings such as this, where there’s no way I would’ve been doing the research myself.

The last line of the acknowledgments says, “The usual caveat applies.” I have no idea what that means–something in economics-speak? I have noticed in general that econ papers have longer acknowledgment sections than stat papers do. My theory has always been that economists write fewer articles and put more time into each one, whereas statisticians spit out articles at a machine-gun rate and don’t look back. The two fields have different systems: my impression is that in econ, it’s a big deal to be published in the American Economic Review or wherever, whereas, in stat, an article in JASA or Annals of Statistics or wherever won’t necessarily get noticed anyway.

4 thoughts on “Are High-Quality Schools Enough to Close the Achievement Gap? Evidence from a Bold Social Experiment in Harlem”

  1. The caveat is usually something like the following: all opinions and any errors or omissions are due to the authors.

    Also, your impression about AER is quite correct. An AER is a big publication. It is the type of publication that, outside top departments, probably assures you tenure if you have published a few other things in remotely decent journals.

  2. "The usual caveat applies" probably means all mistakes and opinions are their own and not necessarily of others.

    I read through this paper quickly after David Brooks's column about it several months back. I also read the book Whatever It Takes, which is a very well-written account of the Harlem Children's Zone (HCZ). One thing Geoffrey Canada, who founded HCZ, discovered after starting his middle school is that scores didn't really improve until kids who had been through all the other early childhood programs (of which there are many, and they start with prenatal education for parents) reached his middle school. Perhaps it's minor, but the authors of the paper seem a bit dismissive of the other programs at HCZ that Mr. Canada emphasizes so strongly elsewhere (even if the lottery students didn't necessarily participate in all of these programs). Overall, though, it's great to see the HCZ studied rigorously; the program implements so many of the policy recommendations that Jim Heckman and many others have been advocating for.

  3. The above comments about the caveat are correct. Who knows why, but I've never seen a (recent) econ paper without the "all remaining errors are the author's" note.

    As for AER: it's one of the top two publications in economics (along with Econometrica) and a pub there along with some decent second-tier papers will surely land you tenure outside of a top 10 school. Perhaps it's because AERs are scarcer than JASAs? There are 5 issues a year and something like 250 authors each year (many fewer articles, since each coauthor is counted here). Many economics departments have 50 researchers. The top 50 surely in total must have a couple thousand.

  4. As a transracial adoptee, I read this study and post with a sense of hope, followed by a greater sense of dread.

    Why? Because I know that White America (remember, that's my family I'm talking about, too) is still overly invested in racialist explanations for the achievement gap. Even if it accepts the structural contributors to observed disparities in educational (or other) outcomes, it would be loath to invest its resources in bettering someone else's kids (read: minority children who don't look, speak, or act like them).

    I see this in my own family, where my upper middle-class sibling is fighting to redirect vast public school resources in our wealthy county with huge achievement gaps to better accommodate the unique educational needs of her very bright child, even though this would take resources away from basic services needed for other kids in the county who already have less well resourced schools, teachers, and infrastructure, largely because of their demographics. Worse, those additional resources believed to be necessary are ones that the family itself could provide.

    If it is impossible for our blended family to understand the value of making vast resources available to close, rather than expand, achievement gaps, how can we have any hope that the broader society will?

    If you provide the evidence that you can close the achievement gap with appropriate services, it raises an ethical imperative to act. But what we'll find is that society won't act because of a very simple truth: that we remain a skewed society that knows that leveling the playing field will adversely affect our odds of benefiting from the status quo. Imagine there being more, not less, competition to get into the best schools, or to place our kids onto solid career tracks. Why would even the most liberal of families, such as mine, be willing to give that up when the odds are already rather ugly?
