What’s the origin of the term “chasing noise” as applying to overinterpreting noisy patterns in data?

Roy Mendelssohn writes:

In an internal discussion at work I used the term “chasing noise”, which really grabbed a number of people involved in the discussion. Now my memory is I first saw the term (or something similar) in your blog. But it made me interested in who may have first used the term? Did you hear it first from someone, or have any idea of who may have first used the term, or something close to it?

My reply:

The term seems so natural. I don’t know if I heard it from somewhere. Here’s where I used it in 2013. I’ve also used the related term “noise mining.”

A quick Google search came up with this 2012 article, Chasing Noise, by Brock Mendel and Andrei Shleifer in the Journal of Financial Economics, but they’re using the term slightly differently, referring not to overfitting explanations of noisy statistical findings, but to random economic behavior.

Roy then gave some background:

The term came up because I am a firm believer that if you ignore spatial and temporal correlation in space-time data, as many analyses do, you end up uncovering patterns that are transitory in the dynamical sense. Either you have overestimated the effective sample size (as when the talks on Stan discuss ESS for analyzing the chains), or you are just being fooled by the seeming patterns that noise produces when data are dependent. Actually, this happens even when they are independent: when state lotteries started, I knew quite a few people who were positive they had found a pattern in the numbers, and sure enough they all lost a fair amount of money.
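To make Roy's point about effective sample size concrete, here is a minimal sketch. It assumes the data follow a simple AR(1) process, which is an illustrative choice and not something Roy specified, and it uses the standard AR(1) approximation n_eff ≈ n(1 − ρ)/(1 + ρ) for the mean. The function names are hypothetical.

```python
import numpy as np

def ar1_series(n, rho, rng):
    """Simulate a stationary AR(1) series x[t] = rho * x[t-1] + e[t], e ~ N(0, 1)."""
    x = np.empty(n)
    # Draw the first value from the stationary distribution.
    x[0] = rng.normal(scale=1.0 / np.sqrt(1.0 - rho**2))
    for t in range(1, n):
        x[t] = rho * x[t - 1] + rng.normal()
    return x

def lag1_autocorr(x):
    """Sample lag-1 autocorrelation."""
    xc = x - x.mean()
    return float(np.dot(xc[:-1], xc[1:]) / np.dot(xc, xc))

def effective_sample_size(n, rho):
    """AR(1) approximation to the effective sample size for the mean."""
    return n * (1.0 - rho) / (1.0 + rho)

rng = np.random.default_rng(0)
n, rho = 1000, 0.8  # assumed values, for illustration only
x = ar1_series(n, rho, rng)

rho_hat = lag1_autocorr(x)
ess = effective_sample_size(n, rho_hat)

# Standard error of the mean, ignoring vs. accounting for dependence.
naive_se = x.std(ddof=1) / np.sqrt(n)
adjusted_se = x.std(ddof=1) / np.sqrt(ess)

print(f"estimated lag-1 autocorrelation: {rho_hat:.2f}")
print(f"nominal n = {n}, effective n is roughly {ess:.0f}")
print(f"naive SE = {naive_se:.3f}, adjusted SE = {adjusted_se:.3f}")
```

With positive autocorrelation the effective sample size is much smaller than the nominal one, so an analysis that treats the observations as independent reports standard errors that are too small, and apparent "patterns" look more convincing than they should.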

Anyway, if any of you know further history on this use of the expression “chasing noise” as applying to overinterpretation of noisy patterns in data, please let us know in comments.

10 thoughts on “What’s the origin of the term “chasing noise” as applying to overinterpreting noisy patterns in data?”

  1. I certainly don’t know the “origin”, but a quick googling turns up this earlier quote from Wolfinger (of the SAS Institute) et al. 2001:

    “Proper determination of significance prevents researchers from “chasing noise” and helps them appropriately distinguish between important biological changes and chance variation (Wittes and Friedman, 1999).”

    Surely that speaks about overinterpretation of noisy patterns in data? More generally though, isn’t it a natural analogy from the engineering field’s use of “signal and noise”? Hard to tell who made the analogy first.

    Wolfinger et al. 2001. Assessing Gene Significance from cDNA Microarray Expression Data via Mixed Models. Journal of Computational Biology, Volume 8, Number 6, 2001.

  2. I’m shocked to think the phrase might have appeared so recently. We can push the date back to 2010, which is when the paper cited above was first published online as an NBER working paper (https://papers.ssrn.com/sol3/papers.cfm?abstract_id=1617044). The phrase may have originated in the context of electrical engineering, where “chasing noise” means pursuing the source of interference in a signal so that it can be eliminated–the opposite of promoting noise as valid information. For example, “…chasing noise problems can be both frustrating and time-consuming” (Whitlock, 2008). But the phrase had made the jump to being used critically at least by 2000. In “CHASING ANOMALOUS SIGNALS” (Jones, 2000), the author laments:

    “It is high time to strongly question claims of cold fusion based on crude techniques and to demand tests at a rigorous scientific-proof level. Compelling evidence requires use of the best instruments available, incorporating fast data-sampling and digitization methods, for instance. Different detectors must give signals which agree quantitatively. A real signal will be capable of scaling, and should not shrink as background levels are reduced. Otherwise, the researcher may well be chasing noise, and probably making noise as well. (Hyping questionable results in the media seems to be a characteristic practice of those claiming excess heat, which we never did.) I have not seen any compelling evidence for any “cold fusion” effects, to date.”

    Adjust the terminology (from cold fusion to himmicanes and detectors to samples and instruments to statistical tests), and this passage is virtually indistinguishable from a blog post Andrew might have written!

    I don’t know if this is the origin, but the author clearly knew the common meaning of the phrase in the context of troubleshooting electronic equipment, and he seems to think his audience would be familiar enough with it to appreciate his wordplay–using noise at first to mean signal interference or an anomalous signal, then as a ruckus.

  3. This seems like a similar usage from 1978:

    “One should be cautioned that adaptive models react to any change in demand, whether it is a change in pattern or noise. We like the adaptive feature of changing to changing demand patterns but dislike the feature of “chasing noise around” that adaptive models tend to have. Weaknesses in this concept include the chasing of noise in the series …”

    Everett E. Adam, Ronald J. Ebert, 1978.
    Production and Operations Management: Concepts, Models, and Behavior
    https://books.google.com/books?id=DlYgIYZKFUMC

  4. I recall hearing it (not for the first time) in a mid-1990s discussion where it was, I think, attributed to Deming. I can’t find a good reference for that (*), but I wonder if it’s really a 1920s Shewhart phrase.

    *That’s not quite true: I did find “Chasing the noise: W. Edwards Deming would be spinning in his grave” (https://statmodeling.stat.columbia.edu/2013/10/24/chasing-the-noise-w-edwards-deming-would-be-spinning-in-his-grave/), but that doesn’t give a reference. :-)

  5. “Chasing noise” is a term I’ve encountered often in books and articles about Statistical Process Control. I have the impression that it originated with Deming, but I actually don’t know that for sure. Donald Wheeler uses it – we could ask him.
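
The behavior described in the 1978 quote in comment 3, a forecasting model that reacts to every wiggle in demand, can be sketched roughly as follows. This uses simple exponential smoothing with a fixed smoothing constant as a stand-in for the adaptive models Adam and Ebert discuss. The demand series, the noise level, and the two values of alpha are assumptions chosen for illustration.

```python
import numpy as np

def exp_smooth_forecasts(demand, alpha):
    """One-step-ahead forecasts from simple exponential smoothing.

    forecasts[t] is the forecast for period t, using demand through t - 1.
    """
    forecasts = np.empty(len(demand))
    forecasts[0] = demand[0]  # initialize with the first observation
    for t in range(1, len(demand)):
        forecasts[t] = forecasts[t - 1] + alpha * (demand[t - 1] - forecasts[t - 1])
    return forecasts

rng = np.random.default_rng(1)

# Assumed setup: demand has a constant true level, so every change is pure noise.
true_level, noise_sd, n = 100.0, 10.0, 2000
demand = true_level + rng.normal(scale=noise_sd, size=n)

reactive = exp_smooth_forecasts(demand, alpha=0.9)  # reacts strongly to each change
smooth = exp_smooth_forecasts(demand, alpha=0.1)    # reacts slowly

# Mean squared one-step-ahead forecast error, skipping the initialization period.
mse_reactive = float(np.mean((demand[1:] - reactive[1:]) ** 2))
mse_smooth = float(np.mean((demand[1:] - smooth[1:]) ** 2))

print(f"MSE with alpha = 0.9: {mse_reactive:.1f}")
print(f"MSE with alpha = 0.1: {mse_smooth:.1f}")
```

When there is no real change in the demand pattern, the highly reactive forecast follows the noise from one period to the next and ends up with the larger forecast error. The same responsiveness would help if the level really did shift, which is the trade-off the quote describes.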
