No, I don’t believe etc etc., even though they did a bunch of robustness checks.

Dale Lehman writes:

You may have noticed this article mentioned on Marginal Revolution, https://www.sciencedirect.com/science/article/abs/pii/S0167629619301237. I [Lehman] don’t have access to the published piece, but here’s a working paper version. It might be worth your taking a look. It has all the usual culprits: forking paths, statistical significance as the filter, etc etc. As usual, it is a complex piece and done “well” by many standards. For example, I had wondered about their splitting time into market hours versus the hours before the market opens – I thought they might have ignored time zone differences. However, they did convert all accident data to the Eastern time zone for purposes of determining whether an accident occurred before the market was open or not.

The result – that when the market goes down, fatal driving accidents go up – with a causal interpretation, may be correct. I don’t know. But I find it a bit hard to believe. For one thing, a missing link is whether the driver is actually aware of what the market is doing. Also, the paper links the day’s overall stock market performance with that day’s fatal accidents, but the market is often up for part of the day and down for the rest, so intraday variability may undermine what the paper is finding. Perhaps larger market drops occur on days when the market is down for larger portions of the day (I don’t know, but that could potentially be explored), but I don’t see that they examined anything to do with intraday variability. Finally, the data are not provided, although the source is publicly available (and would probably take me a week to match up to what they purported to use). Why can’t the standard be to just release the data (and, yes, the manipulations they did – code, if you’d like) for papers like this? Clearly, they do not expect anyone to actually try to duplicate their results and see whether alternative analyses produce different results.

And, the same day, I received this email from James Harvey:

I can’t access this one and the abstract doesn’t give a clue about the methods but the headline claims seem *HIGHLY* unlikely to be true.

A huge majority of Americans don’t know or care what’s going on at the stock market’s open on any given day. The idea that the extremely small percentage who do know and care could cause a detectable rise in traffic fatalities is absolutely preposterous.

The paper claims that these fatalities are caused by a “one standard deviation reduction in daily stock market returns”. Exactly what this means isn’t clear to me offhand, but judging from the distribution of ups and downs shown in the diagram here, one standard deviation looks to be about 1%. Hardly jump-off-the-bridge numbers.

I agree, and I agree.

As we’ve said before, one position we have to move away from is the attitude that a social-science claim in a published paper, or professional-looking working paper, is correct by default. The next step is to recognize that robustness checks don’t mean much. The third step is to not feel that the authors of a published paper, or professional-looking working paper, are owed some sort of deference. If news outlets are allowed to hype these claims without reanalyzing the data and addressing all potential criticisms, then, yes, we should be equally allowed to express our skepticism without reanalyzing the data and figuring out exactly what went wrong.

On the plus side, the commenters on that Marginal Revolution post are pretty much uniformly skeptical of that stock-market-and-crashes claim. The masses are duly skeptical; we now just have to explain the problem to the elites. Eventually even the National Academy of Sciences might understand.

P.S. Yes, it’s possible that good days in the stock market cause safer driving. It’s also possible that the opposite is the case. My point about the above-linked paper is not that it has striking flaws of the pizzagate variety, but rather that we should feel under no obligation to take it seriously. It’s a fancy version of this.

If you are interested in the topic of the psychological effects of stock price swings, then I’d recommend looking at lots of outcomes, particularly those that would affect people who follow the stock market, and moving away from the goal of proof (all those robustness tests, etc.) toward an exploratory attitude (what can be learned?). It’s not so easy, given that social science and publication are all about proof – maybe it would be harder to get a frankly exploratory paper on the topic published – but I think that’s the way to go, if you want to study this sort of thing.

At this point, you might say how unfair I’m being: the authors of this article did all those robustness checks and I’m still not convinced. But, yeah, and robustness checks can fool you. Sorry. I apologize on behalf of the statistics profession for giving generations of researchers the impression that statistical methods can be a sort of alchemy for converting noisy data into scientific discovery. I really do feel bad about this. But I don’t feel so bad about it that I’ll go around believing claims that can be constructed from noise. Cargo-cult science stops here. And, again, this is not at all to single out this particular article, its authors, or the editors of the journal that published the paper. They’re all doing their best. It’s just that they’re enmeshed in a process that produces working papers, published articles, press releases, and news reports, not necessarily scientific discovery.

31 thoughts on “No, I don’t believe etc etc., even though they did a bunch of robustness checks.”

    • Turns out, via a birthday present from a couple of years ago, I own a hard copy of the book. Yes, this is a very amusing achievement. As far as I can tell, each of the graphs has about 10-20 data points, and some only about five, with the x-axis always being time. The highest correlation I found was on page 131, 99.4% – “LEGO revenue vs Worldwide revenue from commercial space launches”. Somewhere in the book there may well be a higher correlation, but a quick calculation indicates it can’t be much higher. However, see pages 78-79.

    • There is a simple intuitive explanation for the study. The stock market and car crashes are both correlated with good and bad news in general. If there is a natural disaster or war, or a major bankruptcy, those things hurt the stock market and also affect driving behavior.

  1. I think that a paper such as this should start with an explanation of what they think is a plausible mechanism of causality.

    Then you can work on robustness checks of the theory of causal mechanism.

    For example: if there is an effect of the daily fluctuation of the market, is there an effect when the market is down for a week, a month, a year? If not, how would the theory of causal mechanism predict that difference?

    • Basically, these theorized causal relationships should be run through a version of Hill’s criteria for causation. For example, they should be subjected to a “temporality” test and a “dose-response” test.

      Cross-sectional data that point to a correlation are not sufficient to conclude causation.

    • Joshua:

      The causal mechanism is clear, right? People are upset about the stock market dropping so they drive less carefully. The trouble is that there are a lot of car crashes in this country, the vast majority of which we can assume are entirely unrelated to the stock market. So we have a kangaroo problem.

      • Andrew –

        So let’s test that causal mechanism. Measure how carefully a representative sample of people drive on days that the stock market goes down. I mean things like speed of driving, allowing for appropriate distance from the cars ahead, etc.

        I propose a “causal mechanism” that people drive more *carefully* when they are upset about the stock market. Or alternatively, that they are more reckless when they’re happy about the stock market going up. Why is their proposed mechanism more or less plausible than mine?

  2. I think the causal analysis is a good reason to be skeptical, but hmm…

    Is there a causal-free reason to doubt analyses like this? Maybe something to do with lots of autocorrelation in both values leading to a smaller effective sample size than assumed?

  3. 1. “They’re all doing their best.” = “I’m not angry, just disappointed.” Ouch.

    2. I feel almost certain that the idea for this study came from a pun someone made in the lab involving the stock market crashing and stock cars crashing.

    3. It doesn’t seem too implausible to me that the performance of the stock market could have a butterfly effect that filters down to traffic, even if most or all drivers are oblivious. But that just brings us back to the piranha problem. That is, there are so many butterflies flapping their wings, they’d surely…eat each other?

    • “That is, there are so many butterflies flapping their wings, they’d surely…eat each other?”

      Having much experience around rivers I think of butterflies and piranhas as being analogous to eddies: an obstruction or event causes a change in the flow pattern of the river. The obstruction causes local deviation in the flow pattern but once the flow passes the obstruction the force of the flow overwhelms the deviation and smooths it out. The total flow of the river acts like the piranha, consuming the minor deviations.

      I’ve never seen an eddy amplify itself as it goes down a river.

      This comparison isn’t perfectly analogous to social events, and I don’t want to bother thinking through the differences. But in general it’s a good analogy. Even in social events, only an infinitesimally small fraction of events are powerful enough to overwhelm the overall flow.

      I guess I’m trying to say butterfly effects might happen, but they must be very, very rare, because the piranha is a very powerful force.

  4. When I was in engineering undergrad, we were told always to do a rough back-of-the-envelope calculation for whatever we were working on, to get an idea of the rough magnitude of the answers we might reasonably expect – along the lines of the James Harvey email. If our detailed calculation was in the right ballpark, then good. If not, we were expected to explain why either the rough calculation or the detailed one was wrong, which would typically involve digging through the detailed calculation to figure out what was driving the divergence. Seems like a good practice (though I’m not an engineer now). Might it be adapted to social science? (Or maybe it is done already.)

    • “Might it be adapted to social science?” I think it would be a good practice for social scientists to adopt. However, I doubt that many would care to – my impression is that all too many social scientists like the current practices because they are “standard rules to follow,” while what Norman proposes involves serious thinking: not just asking “how do I back this up?” but giving “how could this be refuted?” equal time.

      • Though I am not now and have never been an engineer, I find rough calculation to be very useful in my day-to-day life. In general, I think rough computation or “hand-simulation” is extremely valuable.

        I work largely in cognitive psychology, i.e. “boring” psychology, and in particular with models of the dynamics of processes involved in object recognition, memory retrieval, and simple motor control. Despite being extremely unsexy, these processes are definitely complex, resulting in models that are also often difficult to make intuitions about. In these circumstances, I find working through the models approximately, largely by hand, is extremely valuable. It’s funny because “rough” is usually thought of as “fast”, whereas it can take many hours to work through these complex models even in a rough way, in contrast to the minutes or seconds required for the computer. Even so, I think this effort is worthwhile because I’ve found myself on both sides: The simulation revealed a feature of the model that I hadn’t thought about; and I’ve discovered errors in implementation when the simulation failed to match my manual labors.

        Even though I don’t work in social science, many of the phenomena to be explained would seem to be similarly complex. As a result, I agree that an emphasis on rough computation would be similarly valuable in that field.

        I do teach students who have social science backgrounds. Seconding Martha’s point, I find that many social science students are explicitly taught *not* to rely on intuition or thinking things through themselves, but instead to rely on a set of “standard rules”. I think the motivation for this is that social systems are complicated, so intuitions shouldn’t be trusted?

        But as I said above, in my experience, complex systems are exactly the arena where thinking things through is most critical. I find I have to do a lot of “de-programming” to get students to realize they have that power locked away in their heads.
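As a concrete illustration of the back-of-the-envelope habit discussed in this thread, here is what such a rough check might look like for the paper’s headline claim. Every number below is an order-of-magnitude assumption of mine, not a figure from the paper: the daily crash count is a rough recollection of US totals, and the effect size and market-follower share are pure placeholders.

```python
# A rough sanity check in the spirit of the engineering habit described above.
# All three inputs are illustrative assumptions, not sourced figures.

fatal_crashes_per_day = 100    # ~36,000 US fatal crashes/year -> roughly 100/day
share_market_followers = 0.02  # guess: ~2% of drivers track the market intraday
claimed_effect = 0.004         # placeholder: a 0.4% rise in fatalities per 1-sd drop

# Extra fatal crashes the claim implies on a bad market day:
extra_crashes = fatal_crashes_per_day * claimed_effect
print(f"implied extra fatal crashes/day: {extra_crashes:.2f}")

# If only the market followers respond, their own crash risk must rise by:
follower_baseline = fatal_crashes_per_day * share_market_followers
required_increase = extra_crashes / follower_baseline
print(f"required risk increase among followers: {required_increase:.0%}")
```

The point of the exercise is the same as in engineering: if a small aggregate effect implies an implausibly large per-follower effect (here, a 20% jump in fatal-crash risk among the small group that actually follows the market), the headline estimate deserves extra scrutiny.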

  5. For the day traders who keep a close eye on the markets, and who would plausibly be driving the results, I don’t think drops alone would necessarily be the concern. I hear constant radio ads for a trading school that “teaches you to short the market.” Shorting is well known enough that expected increases might also lose people a lot of money if they screw it up.

    My basic point is forking paths: I would think overall increased volatility would be as likely to cause problems as drops per se. The theory is not well pinned down.

  6. Imagine the roadway carnage caused by clients of this quant fund:

    “Events happened that statistically never could happen,” said the chief investment officer of Disciplined Equities in a telephone interview from St. Petersburg, Florida.

    Quigley spelled out the odds to clients in a note. As he computed it, the crash in the momentum factor was so rare that writing out the chances of occurrence on any given day required a 16-digit number — followed by 63 zeroes.

    https://www.msn.com/en-us/money/markets/quant-shock-that-e2-80-98never-could-happen-e2-80-99-hits-wall-street-models/ar-BB1aZ3Ad?ocid=uxbndlbing

    • Thanks for the correction — the sentence seemed to be using “like” as in beatnik or valley girl talk, but that seemed rather unlikely for people commenting on this blog to use. I didn’t think of the (now obvious) typo explanation.

  7. I’m a little late in commenting here, but there is another issue here: people in different fields, or even different subfields of the same field, often don’t know about potentially useful datasets when they move outside their specific areas of research.

    Intraday data for the stock market overall are available from a number of sources. TAQ even has intraday trading data for individual stocks in the US going back to 1993. Everyone on the academic side of finance knows about this, but few economists outside of finance seem to, and these authors definitely don’t.

    There was an interesting paper in the Journal of Finance in 2018 which was actually able to match hedge fund managers with the cars they own (that paper looked at car type and risk preference). That data could have been useful here but once again, the authors don’t seem to even know of that paper’s existence.

  8. Not having read the paper, I wonder whether it’s simply that both are outcomes due to (multiple) unobserved variables.

    Big market drops might be triggered either by bad financial/economic news or by really disturbing exogenous news – a mass shooting, an airline crash, a natural disaster. Perhaps bad exogenous news jars people into driving in a nonroutine way – more distractedly, at an atypical time, whatever – that results in more fatal accidents. They don’t have to pay any attention to the market, just to the same thing that triggered the market drop.

  9. I did a paper in MBA school in 1981 on whether New York Stock Exchange prices went up or down when the Yankees won or lost in the World Series. I didn’t see any solid results for prices, but volume was somewhat lower on days when the Yankees played in New York City during the World Series.
