Hey! Here’s a study where all the preregistered analyses yielded null results but it was presented in PNAS as being wholly positive.

Posted on March 24, 2024 9:43 AM by Andrew

Ryan Briggs writes:

In case you haven’t seen this, PNAS (who else) has a new study out entitled “Unconditional cash transfers reduce homelessness.” This is the significance statement:

A core cause of homelessness is a lack of money, yet few services provide immediate cash assistance as a solution. We provided a one-time unconditional CAD$7,500 cash transfer to individuals experiencing homelessness, which reduced homelessness and generated net societal savings over 1 y. Two additional studies revealed public mistrust in homeless individuals’ ability to manage money and the benefit of counter-stereotypical or utilitarian messaging in garnering policy support for cash transfers. This research adds to growing global evidence on cash transfers’ benefits for marginalized populations and strategies to increase policy support. Although not a panacea, cash transfers may hasten housing stability with existing social supports. Together, this research offers a new tool to reduce homelessness to improve homelessness reduction policies.

Based on that, I was surprised to read the pre-registration documents and supplemental information and learn that literally none of the outcomes that the researchers pre-registered were significant. Even the variable that they chose to focus on (days homeless) was essentially the same in the 12 month follow up (0.18 vs 0.17) and, just eyeballing Table S3, it seems the differences were rarely large and not ever significant in any single follow up period.

This is now generating news coverage about how cash transfers work to reduce homelessness (e.g., here and here).

I guess in a sense pre-registration worked because we can see that they did not expect this and had to explore to find it, but what good does that do if the press just reports it all credulously?

I have mixed feelings on this one. On one hand, I don’t like the whole statistical-significance-thresholding thing: if the study found positive results, this could be worth reporting, even if the results are within the margin of error. This within-the-margin-of-error bit should just be mentioned in the news articles. On the other hand, if the researchers are rummaging around through their results looking for something big to report, then, yeah, these results will be massively biased upward.
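To make the biased-upward point concrete, here is a minimal simulation sketch. Every number in it is made up for illustration (a small true effect, a noisy estimate, and a rule that only estimates clearing a threshold get reported); none of it comes from the paper.

```python
import random
import statistics

TRUE_EFFECT = 0.1  # hypothetical small true effect, in standard-error units


def simulate_selected_estimates(true_effect, se=1.0, n_outcomes=10_000,
                                threshold=1.96, seed=0):
    """Simulate noisy estimates of one true effect and keep only the 'big' ones.

    All parameter values are hypothetical; they are not taken from the study.
    """
    rng = random.Random(seed)
    estimates = [rng.gauss(true_effect, se) for _ in range(n_outcomes)]
    # The "rummaging" step: only estimates that clear the threshold get reported.
    reported = [e for e in estimates if e / se > threshold]
    return estimates, reported


estimates, reported = simulate_selected_estimates(TRUE_EFFECT)
print(f"true effect:                {TRUE_EFFECT:.2f}")
print(f"mean of all estimates:      {statistics.mean(estimates):.2f}")
print(f"mean of reported estimates: {statistics.mean(reported):.2f}")
```

The average over all the simulated estimates sits close to the true effect, while the average of the ones that get "reported" has to be larger than the threshold by construction. That gap is the selection bias.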

So, from that perspective, maybe a good headline would not be, “Homeless people were given lump sums of cash. Their spending defied stereotypes” or “B.C. researchers studied how homeless people spent a $7,500 handout. Here’s what they found,” but rather something like, “Preliminary results from a small study suggest . . .”

But then we could step back and ask, How did this study get the press in the first place? I’m guessing PNAS is the reason. So let’s head to the PNAS paper. From the abstract:

Exploratory analyses showed that over 1 y, cash recipients spent fewer days homeless, increased savings and spending with no increase in temptation goods spending, and generated societal net savings of $777 per recipient via reduced time in shelters.

I guess that “exploratory analysis” is code for non-preregistered or non-statistically-significant. Either way, I think it’s irresponsible and statistically incorrect—although, regrettably, absolutely standard practice—to report this “$777” without any regularization or partial pooling toward zero. It’s a biased estimate, and the bias could be huge.
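As a rough sketch of what partial pooling toward zero would do to a number like this, here is the textbook normal-normal shrinkage formula. The $777 is the reported estimate; the standard errors and the prior scale are hypothetical placeholders, since neither is quoted here, and the formula is a stand-in for whatever regularization one would actually choose.

```python
def shrink_toward_zero(estimate, se, prior_sd):
    """Posterior mean under a normal(0, prior_sd) prior and a normal likelihood.

    weight = prior_sd^2 / (prior_sd^2 + se^2), so noisier estimates
    (larger se) are pulled harder toward zero.
    """
    weight = prior_sd**2 / (prior_sd**2 + se**2)
    return weight * estimate


# 777 is the reported point estimate. The standard errors and the prior
# scale below are HYPOTHETICAL placeholders, not values from the paper.
raw = 777.0
for se, prior_sd in [(250.0, 500.0), (500.0, 500.0), (1000.0, 500.0)]:
    pooled = shrink_toward_zero(raw, se, prior_sd)
    print(f"se={se:6.0f}  prior_sd={prior_sd:5.0f}  shrunk estimate={pooled:7.1f}")
```

The point is only qualitative: the less precisely the savings are estimated, the further a regularized estimate moves from $777 toward zero.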

Figure 1 of the paper looks very impressive! This figure displays 35 outcomes, almost all of which go in a positive direction (fewer days homeless, more days in stable housing, higher value of savings . . ., all the way down to lower substance use severity, lower cost of all service use, and lower cost of shelter use). The very few negative outcomes were tiny compared to their uncertainty. If you look at Figure 1, the evidence looks overwhelming.

But Figure 1 does not seem like such a great summary of the data displayed elsewhere in the paper. Looking at Table 3, the good stuff all seems to be happening in the 1-month and 3-month followups without much happening after 1 year.

Here’s what the authors wrote:

The preregistered analyses yielded null effects in cognitive and well-being outcomes, which could be due to the low statistical power from the small participant number in each condition or the possibility that any effect on cognition and well-being may take more than 1 mo to show up.

I agree that these null findings should be mentioned right up there in the abstract. They should also include the possibility that the treatment really has no consistent effect on these outcomes. It’s kinda lame to give all these alibis and never even consider that maybe there’s nothing going on.

What about the housing effects going away after a year? The authors write:

First, the cost of living is extremely high in Vancouver, and the majority of the cash was spent within the first 3 mo for most recipients. Second, while the cash provided immediate benefits, control participants eventually “caught up” over time.

On the other hand, here’s what they said about a different result:

By combining the two cash and two noncash conditions to increase statistical power, exploratory analyses showed that cash recipients showed higher positive affect at 1 mo and higher executive function at 3 mo. Based on debriefing, participants expressed that while they were initially happy with the cash transfer, moving out of homelessness into stable housing took substantial efforts and hard work in the first few months, which could explain the delayed effect on cognitive function.

They’ve successfully convinced me that they have the ability to explain any possible result they might find.

The thing that bothers me most about the paper is that the authors don’t seem to have wrestled with the ways in which their results seem to refute their theoretical framework. Their choice of what to preregister suggests that they were expecting to find large effects on cognitive and subjective well-being outcomes and then maybe, if they were lucky, they’d find some positive results on financial and housing outcomes. I guess their theory was that the money would give people a better take on life, which could then lead to material benefits. Actually, though, they found no benefits on the cognitive and subjective outcomes (when I say “no benefits,” I mean, yeah, really nothing, not just nothing statistically significant), but the money did seem to help people pay the rent for the first few months. That’s fine (there are worse things than giving low-income people some money to pay the rent!); it’s just a different story than what they’d started with. It’s less of a psychology story and more of an economics story. In any case, yeah, further study is required. I just think that they could get the most from their existing study if they thought more about what went wrong with their theory.

18 thoughts on “Hey! Here’s a study where all the preregistered analyses yielded null results but it was presented in PNAS as being wholly positive.”

Adede on March 24, 2024 at 10:30 am said:

A one-time cash payment of 7,500 providing benefits for 3 months but not a year seems like a reasonable finding. It’s almost along the lines of “patrons were less hungry after leaving the restaurant, but hungrier at the 12 hour followup.”

- Daniel Lakeland on March 24, 2024 at 10:41 am said:
  
  Right? Duh…
  
  Also, the cognitive effects might take time? Duh… I’m sure every PTSD suffering veteran is immediately cured by their first visit to a therapist? Every impoverished child immediately cured of anxiety by a Big Mac? /s
  
  The story here isn’t really about giving money to poor people. There have been hundreds of studies on that and they all show the same thing.
  
  The study here is about giving grant money to researchers and what a tremendous waste that is because we’ve forgotten as a society how to do good science.
  
- Joshua on March 24, 2024 at 11:36 am said:
  
  Along those lines, there’s this study in contrast that shows quite positive results (no idea if it is similarly flawed – the use of “miracle” could certainly be a red flag):
  
  Roughly 100 unhoused individuals across Los Angeles County and parts of San Francisco were given $750 a month over a year. The money came in the form of an unconditional payment—no strings attached.
  
  https://dworakpeck.usc.edu/sites/default/files/2023-12/Miracle%20Money_Nov%202023_FINAL_12.5.23.pdf
  
  - Dale Lehman on March 24, 2024 at 11:55 am said:
    
    The proliferation of pie charts is not a good sign to begin with. I skimmed over some of their paper (linked in the executive summary you linked) and 30% of the participants were lost to the study – they say “There may also be differential retention rates because people receiving basic income may be more likely to complete follow-up surveys as compared to the waitlist or Miracle Friend only groups.” I don’t understand why they need to speculate here – they know who followed up and who did not – why not report the fractions in each group that were lost? I couldn’t find that anywhere in the paper.
    
    Not impressed by what I see, but as Daniel suggests, at least the researchers got the money to do the study, and they’ll get the pubs and citations that ensue.
    
Shravan Vasishth on March 24, 2024 at 11:10 am said:

An important point to notice in this paper is the following statement:

“To protect participant privacy, data are not publicly available but will be made available upon reviewer request.”

Apparently they do not know how to anonymize data; why would I trust their data analysis skills?

Moreover, in my many years of experience, “will be made available on request” is a code phrase for “you are never going to get this data”. I have tested this out with several papers, asking both the authors and the action editor for their data (I didn’t even get to asking for code) but never heard back.

Perhaps someone on this blog wants to test their promise to release data.

Jessica Hullman on March 24, 2024 at 1:33 pm said:

At least they were up front about the fact that their main analysis was not preregistered. I’ve encountered multiple instances lately where people deviate from their plan, sometimes substantially, but continue to describe their analysis as preregistered. As though trying alone is enough to get the rigor points they perceive the label to bring.

- Jay Patel on March 24, 2024 at 6:28 pm said:
  
  It’d be good to keep a public record of that (not to name and shame, but to spread the word that preregistration doesn’t equal truth).
  
Jonathan (another one) on March 24, 2024 at 2:16 pm said:

“It’s less of a psychology story and more of an economics story. In any case, yeah, further study is required.”

Not to an economist ;)

OrderUncertainty on March 24, 2024 at 2:46 pm said:

There seems to be a related issue in a recent article in the Quarterly Journal of Economics (https://academic.oup.com/qje/article/139/1/1/7220727).

The expensive READI program was meant to decrease gun violence in Chicago, but the authors generally find null results on their primary outcomes. They highlight a specific subgroup in the abstract: “[P]articipants referred by outreach workers—a prespecified subgroup—saw enormous declines in arrests and victimizations for shootings and homicides (79% and 43%, respectively) which remain statistically significant even after multiple-testing adjustments.”

The issue is that the pre-registration plan (https://osf.io/zt39c) specified way more subgroups, and the spirit contradicts what’s stated in the abstract:

“We plan to examine program impacts for all study participants, as well as among various subgroups based on participants’ individual characteristics (age, number of prior arrests, number of prior felonies, risk score), neighborhood (East and West Englewood, West Garfield, Austin, North Lawndale), provider, referral pathway, position in network, and different interactions of these subgroups. We will conduct this in the spirit of exploratory analysis, not adjusting for multiple hypothesis tests, as we do not anticipate being powered to detect moderate heterogeneity. Finally, we will also implement machine learning methods to flexibly search for treatment effect heterogeneity in a principled way, again as exploratory analysis.”

The p-value they are referring to in the QJE adjusts for doing only three tests total (Table V of the original article). It seems pretty misleading to be highlighting just the “prespecified” outreach heterogeneity—and its p-value—in the abstract. This study will surely inspire related programs and spending, but the take-aways seem actually pretty sobering about such efforts.

- Daniel Lakeland on March 24, 2024 at 8:46 pm said:
  
  > but the take-aways seem actually pretty sobering about such efforts.
  
  What do you think the take-aways are? I doubt I’m taking away the same thing you are, so I’d be interested in your take.
  
  - [email protected] on March 25, 2024 at 1:34 am said:
    
    Daniel: my reading is that he thinks “the takeaway” is virtually no effect.
    
    - Daniel Lakeland on March 25, 2024 at 2:25 pm said:
      
      That was what I thought too. And it’s not the take away I have at all. My own personal model for violent crime is based on the observation that across all countries and across many decades, violent crime rates are exponentially related to income inequality as measured by the Gini coefficient
      
      crime ~ exp(k*gini)
      
      Also, if you create the maximum entropy distribution for a given gini, you’ll find that in the countries with low violence there is negligible density in the vicinity of 0 income, and for the higher gini values where violence goes up exponentially, the density near 0 income goes up exponentially as well.
      
      My own belief is that extreme poverty induces criminal behavior and willingness to commit violence. You can’t get rid of that immediately, though, for example by simply giving everyone a UBI. It would take a transition period of 10-20 years of eradicating extreme poverty, over which you’d see a gradual but eventually dramatic reduction in violent crime, because violent crime is committed by males age 15-30 or so. So, if you eliminate extreme poverty, for example by UBI, then over a period of 10-20 years, as males age out of the high risk ages and new males age into them who have not experienced decades of childhood poverty, you will see major declines in violent crime. I do think there would be some immediate effect, but it would be followed by additional effects taking 10-20 years.
      
      This study provides evidence that intensive targeted intervention may have effects on violent crime, but it also suggests that relatively simple short term interventions wouldn’t be effective in the way that my proposed systemic intervention would. It’s basically completely consistent with my own model, so I see it as useful information.
    - Daniel Lakeland on March 25, 2024 at 2:30 pm said:
      
      Interestingly, as gini decreases the incidence of suicide increases… this could possibly be related to a lack of opportunities to make use of investments in education and so forth. One reason people commit suicide is that they feel there is no way to really improve their personal situation, and that can be caused by “everyone has the same income no matter what they do,” which results in a small gini coefficient.
      
      The result of these two trends is that there’s a range of optimal gini, which is somewhere in the 0.25 to 0.35 range, probably about 0.3 being near the bottom of the predicted suicide + violent crime rate. That level of Gini is consistent with the result of taking the US and applying somewhere in the range of 10-20% of GDP towards a UBI.
    - Phil on March 25, 2024 at 3:53 pm said:
      
      If the rate of suicide decreased as gini decreased, I think you could explain that too. I’m tempted to provide such an explanation but it’s really beside the point…which is that I don’t think there’s nearly enough information in this thread (thus far) to conclude anything about a causal relationship between gini and suicide rate, or on the mechanisms for that relationship. But sure, it’s fine material for speculation.
    - Daniel Lakeland on March 25, 2024 at 6:11 pm said:
      
      Phil, absolutely, this is speculation and it’s driven by a substantial amount of data analysis not mentioned here in the thread. The relationship of both suicide and violence with gini is robust both across decades and across all the countries of the world, and follows trends across time within country if I remember correctly. When I showed the graph to a social science professor friend he mentioned it was the strongest relationship he’d ever seen in social sciences. It’s been a year or so since I was looking deeply at this though.
      
      log homicide is well predicted by linear function of gini, and log suicide is also well predicted by linear function of gini. They have slopes with opposite signs. log(homicide+suicide) is well predicted by a parabolic relationship with a gini something like 0.33 achieving the lowest expected value. This gini level is typical of countries like France, Germany, Austria, Switzerland. All those countries have log(homicide) rate about 0, the US has gini about 0.4 and log homicide rate about 2 (so roughly 7x what Germany has), Brazil has a gini of about 0.55 and log homicide rate about 3.5 (roughly 33x Germany). (all of these normalized to the same scale of per 100k/yr before taking the log)
      
      GDP/capita overall is also an important predictor, but if you include gini and gdp/capita you can accurately predict a country’s log(homicide rate) to within about +- 1 for all the countries in the world.
      
      There is a very clear mechanism whereby poverty would cause crime and violence, and indeed almost all violent crime in the US is committed by people in impoverished neighborhoods, often related to the drug trade which is extremely exploitive of young men and recruits by promising them a path out of extreme poverty (though typically doesn’t deliver).
      
      The mechanistic relationship to suicide is much less clear. Most suicide comes about through lack of a promising future. In Australia for example, suicide plummeted at the start of the 2000’s as purchasing power parity GDP/capita soared from exporting raw materials and such to China. That growth rate is unparalleled in modern history, and caused a robust decline in suicide which returned after 2008 or so once growth rate declined. Economic opportunity clearly suppresses suicide rates, and opportunity is somehow related to differences in income. If people start their careers at low pay and consistently earn increases in wages through their career, then mechanistically this increases Gini coefficient. This is a kind of “good Gini”. If on the other hand, you start wage slavery at age 18 and stay in it with no real income increases through time, I certainly would expect higher suicide rates in that context compared to the first context. This story isn’t very hard to understand, but it’s not something I have done extensive analysis of yet.
A on March 24, 2024 at 6:13 pm said:

I think the authors also deliberately misrepresent the generalizability of their estimates by obscuring the enormous degree of sample restriction going on here.

“We screened 732 participants from 22 shelters from four shelter organizations across Metro Vancouver. Our preregistered screening criteria were: age 19 to 65, homeless for less than 2 y (homelessness defined as the lack of stable housing), Canadian citizen or permanent resident, and nonsevere levels of substance use, alcohol use, and mental health symptoms. These screening criteria were used to reduce any potential risks of harm (e.g., overdose) from the cash transfer. To ensure accurate responses, the screening survey was conducted under a cover story without any mention of the cash transfer. Of the 732 participants, 229 passed all criteria (31%). Due to loss of contact with 114 participants despite our repeated attempts to reach them, we successfully enrolled 115 participants in the study as the final sample (50 cash, 65 noncash; see Table 1).”

So they immediately screen out 70% of potential participants due to chronic homelessness, substance abuse, or mental health problems (in addition to age and residency status, which were presumably much less of a factor in the screening process). I know this is done for ethical concerns but they’re basically getting rid of anyone off the bat who is likely to respond less favourably to the treatment. Then among the sample of people who do pass the screening they lose contact with half of them and can’t enrol them in the study, another obvious concern. They find that giving $7500 to each person in this highly selected sample of homeless people saves $8277 (so $777 net) that would have been spent otherwise.

I would be very interested in the authors’ thoughts on how they would go about designing a large-scale cash transfer program to fight homelessness that is only targeted to the fraction of homeless people that they actually ran the study on, and who are almost certainly selected in a way that produces the positive estimates they were presumably eager to find. Should the government start screening homeless people for substance abuse and withholding potential cash transfers to those who fail? How the hell would that work? Although given the authors’ framing of this as a general tool to fight homelessness and their lack of engagement with the blatant issues surrounding their sample restriction, maybe the implication is that this policy should be extended to all homeless people? With absolutely no insight into how the non-random 85% of homeless people that didn’t make the cut for their study would respond? Perhaps I’m being a bit harsh but this strikes me as a complete failure of science communication and a gross display of dishonesty on their part. But hey, at least the authors got a PNAS, right?
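For anyone who wants to check the funnel arithmetic in this comment, here is a short sketch. It uses only the figures quoted above from the paper's methods text (732 screened, 229 passed, 114 lost) and the dollar amounts stated in the comment; the variable names are just for illustration.

```python
# Figures quoted from the paper's methods text and from the comment above.
screened = 732
passed_screening = 229
lost_contact = 114
enrolled = passed_screening - lost_contact

transfer = 7500
gross_savings = 8277  # as stated in the comment

share_passed = passed_screening / screened
share_enrolled = enrolled / screened
net_savings = gross_savings - transfer

print(f"passed screening: {share_passed:.0%}")
print(f"enrolled:         {enrolled} ({share_enrolled:.0%} of those screened)")
print(f"not enrolled:     {1 - share_enrolled:.0%}")
print(f"net savings:      ${net_savings}")
```

The enrolled share works out to roughly one in six of the people screened, which is where the "85% that didn’t make the cut" figure comes from, give or take rounding.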

- Daniel Lakeland on March 24, 2024 at 8:52 pm said:
  
  I think unconditional cash transfers (UBI / Universal Basic Income) are a fantastic tool for a lot of what ails our society. But I don’t think they solve problems related specifically to mental illness and substance addiction. It would imho be an excellent idea, but as with current systems for helping the poor, those with chronic mental illness should probably be assigned a payee who manages the money for them. That system exists already, though it’d need revision for scale.
  
- Clyde Schechter on March 25, 2024 at 10:37 am said:
  
  This kind of attrition during the recruitment/enrollment process is common and almost inevitable in human subjects research. And even apart from ethical considerations, I don’t see anything wrong with targeting the study to a population of people who have characteristics that make it more likely that the study intervention will be effective among them, so long as you are transparent about doing that and the same selection process would be readily applicable in non-research applications of the procedure. In fact, it would be a foolish study design to permit inclusion of people for whom high risk of intervention failure is predictable.
  
  The attrition due to loss to follow-up, however, is another matter. It may be unavoidable, but it is not generally reproducible across settings and is probably another important reason why interventions that look good in the research settings may fail in the real world.
  

Statistical Modeling, Causal Inference, and Social Science
