On a proposal to scale confidence intervals so that their overlap can be more easily interpreted

Posted on December 2, 2023 9:29 AM by Andrew

Greg Mayer writes:

Have you seen this paper by Frank Corotto, recently posted to a university depository?

It advocates a way of doing box plots using “comparative confidence intervals” based on Tukey’s HSD in lieu of traditional error bars. I would question whether the “Error Bar Overlap Myth” is really a myth (i.e. a widely shared and deeply rooted but imaginary way of understanding the world) or just a more or less occasional misunderstanding, but whatever its frequency, I thought you might be interested, given your longstanding aversion to box plots, and your challenge to the world to find a use for them. (I, BTW, am rather fond of dox plots.)

My reply: Clever but I can’t imagine ever using this method or recommending it to others. The abstract connects the idea to Tukey, and indeed the method reminds me of some of Tukey’s bad ideas from the 1950s involving multiple comparisons. I think the problem here is in thinking of “statistical significance” as a goal in the first place!

I’m not saying it was a bad idea for this paper to be written. The concept could be worth thinking about, even if I would not recommend it as a method. Not every idea has to be useful. Interesting is important too.

15 thoughts on “On a proposal to scale confidence intervals so that their overlap can be more easily interpreted”

paul alper on December 2, 2023 at 10:06 AM said:

According to today’s blog, Greg Mayer wrote,

“I, BTW, am rather fond of dox plots.”

Is this a typo, or is there yet another field of endeavor that I am totally unaware of? Is it related to the plotting of “doxing”?

“The meaning of DOX is to publicly identify or publish private information about (someone) especially as a form of punishment or revenge.”

- Gregory C. Mayer on December 2, 2023 at 11:02 AM said:
  
  A dox plot is a box plot combined with a symmetric dot plot (“d”ot plus b“ox”). I don’t know if Leland Wilkinson coined the term, but it was one of Systat’s graphing options. For small data sets, it gives a nice set of summary information (median, hinges, range: the box plot part) along with a view of the entire distribution (the dots). For large samples, the dots can become too numerous and obscure what’s going on.
  
  - Gregory C. Mayer on December 2, 2023 at 11:29 AM said:
    
    In a discussion here a while back, https://statmodeling.stat.columbia.edu/2023/08/16/confusions-about-inference-prediction-and-probability-of-superiority/, Anoneuoid pointed to viopoints, https://rdrr.io/cran/viopoints/man/viopoints.html, as an interesting visualization related to the dox plot.
    
  - paul alper on December 2, 2023 at 11:50 AM said:
    
    Gregory C. Mayer: Many thanks for the clear explanation. I had, of course, tried Wikipedia but nothing came up regarding “dox plot.” So I assumed that, since Wikipedia had no information, it was merely a misprint for box plot(ting of points). Especially because “doxing” is now often in the news, there is (of course) a lengthy wiki on doxing
    
    https://en.wikipedia.org/wiki/Doxing#Notes
    
    “the act of publicly providing personally identifiable information about an individual or organization, usually via the Internet and without their consent to do so”
    
    which somehow contains this reference to days preceding the existence of any current contributors to this blog
    
    “the Stamp Act 1765 in the Thirteen Colonies,”
    
- Howard Edwards on December 4, 2023 at 3:02 PM said:
  
  On the subject of portmanteau words, back in the 1980s Ed Dudewicz of Syracuse University came up with “selestimation”: simultaneous selection (across several populations) and estimation of the largest quantile.
  
Dale Lehman on December 2, 2023 at 10:24 AM said:

Suggested revision to the title: “1001 ways to use p-values to beat an analysis into submission.”
I think the referee was kind.

And, although many on this blog will agree with the paper’s statement about incorrect interpretation of confidence intervals (“I am x% confident that….”) I stand by my belief that such a statement is not terrible. The correct interpretation of a confidence interval is a mouthful, and while accurate, basically leaves the audience perplexed about why the analysis was done at all.

- Noah Motion on December 3, 2023 at 12:29 PM said:
  
  Leaving an audience perplexed about why an analysis was done at all is, perhaps unfortunately, quite often entirely appropriate.
  
Dale Lehman on December 2, 2023 at 12:56 PM said:

I like your idea.

- Dale Lehman on December 2, 2023 at 1:44 PM said:
  
  not me! must be a ghostwriter. And, who knows what idea they like?
  
Ali on December 2, 2023 at 1:01 PM said:

I don’t think it is a new idea but I haven’t read the full paper.

When two 95%CI do not overlap, we can conclude that the two means are different with about 99% confidence, p ~ .01.

To get 95% confidence to use non-overlapping confidence intervals, we can use ~ 85%CI.

One could easily show both intervals using whiskers at values corresponding to 85% and 95%.
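
As a rough check on the figures above, here is a minimal Python sketch. It assumes two independent estimates with equal standard errors and normal sampling distributions, and the function names are made up for illustration. Under those assumptions, two 95% intervals that just touch correspond to a two-sided p-value of about 0.006 for the difference, and the interval level at which non-overlap matches p < .05 comes out near 83%, in the same neighborhood as the approximate values quoted above.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def p_value_when_intervals_touch(level: float = 0.95) -> float:
    """Two-sided p-value for the difference when two CIs just touch.

    Assumes independent estimates with equal standard errors (se).
    """
    z_each = STD_NORMAL.inv_cdf(0.5 + level / 2)  # half-width of each CI, in se units
    # Touching intervals: the means differ by 2 * z_each * se,
    # and the standard error of the difference is sqrt(2) * se.
    z_diff = 2 * z_each / sqrt(2)
    return 2 * (1 - STD_NORMAL.cdf(z_diff))


def level_for_overlap_test(alpha: float = 0.05) -> float:
    """CI level at which 'intervals do not overlap' matches a test at alpha."""
    z_diff = STD_NORMAL.inv_cdf(1 - alpha / 2)  # critical z for the difference
    z_each = z_diff * sqrt(2) / 2               # half-width each CI must have
    return 2 * STD_NORMAL.cdf(z_each) - 1


p_touch = p_value_when_intervals_touch(0.95)
ci_level = level_for_overlap_test(0.05)
print(f"p when two 95% CIs just touch: {p_touch:.4f}")
print(f"CI level for a 5% overlap test: {ci_level:.3f}")
```

With unequal standard errors or small samples (t rather than normal quantiles) the numbers shift, so this is only a back-of-the-envelope calibration.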

Tom Fiddaman on December 2, 2023 at 5:15 PM said:

I’m not sure how many examples are needed to make a myth, but here’s one.

In the Wisconsin deer trustee report, Kroll et al. write (page 50):

“Figure 14 presents graphs used in the planning document. The graphs imply (using fitted exponential trend lines) an upward trend in infection rates, even for yearlings. Yet, the graphs also present 95% confidence limits for each year; and, in every case these limits overlap. From a statistical standpoint, this means there were no significant differences between years!”

https://www.sco.wisc.edu/wp-content/uploads/2012/07/2012WisconsinDeerTrusteeReport.pdf

Dean Eckles on December 2, 2023 at 9:50 PM said:

One dumb comment is that if you want to advertise something, you should provide it in some format other than a Word document.

More substantively, I very much agree it is often useful to plot confidence intervals for differences compared with some reference level, as often even if you want to plot things on the original scale, the inference for comparisons is more important. I’ve done this many times. For an example in a current paper see Figure 4A & B here: https://arxiv.org/abs/1810.03579.
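
To make the idea of intervals for differences from a reference level concrete, here is a small Python sketch. It is not taken from the linked paper: it assumes independent group estimates with known standard errors and a normal approximation, and the group names and numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def difference_intervals(estimates, reference, level=0.95):
    """CIs for each group's difference from a reference group.

    estimates maps group name -> (mean, standard error).
    Assumes the group estimates are independent of one another.
    Returns group name -> (difference, lower bound, upper bound).
    """
    z = NormalDist().inv_cdf(0.5 + level / 2)
    ref_mean, ref_se = estimates[reference]
    intervals = {}
    for name, (mean, se) in estimates.items():
        if name == reference:
            continue
        diff = mean - ref_mean
        se_diff = sqrt(se**2 + ref_se**2)  # independence assumption
        intervals[name] = (diff, diff - z * se_diff, diff + z * se_diff)
    return intervals


# Hypothetical illustrative numbers, not real data.
groups = {"control": (10.0, 1.0), "A": (12.0, 1.0), "B": (14.0, 1.0)}
result = difference_intervals(groups, reference="control")
for name, (diff, lo, hi) in result.items():
    print(f"{name} - control: {diff:.2f} [{lo:.2f}, {hi:.2f}]")
```

Plotting these intervals against a zero line answers the comparison question directly, without asking the reader to judge how much two per-group bars overlap.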

- Dean Eckles on December 2, 2023 at 9:51 PM said:
  
  To be clear, my point about advertising is prompted by “Others have published on these intervals (the mathematical basis goes back to John Tukey) but here I advertise comparative confidence intervals in the hope that more people use them.”
  
John Christie on December 3, 2023 at 9:34 PM said:

I’ve used these kinds of bars frequently in the past in applied contexts where the client only wants to make comparisons visually quickly. For general science it’s not a good idea. We called them comparison bars. I’m pretty sure our group has a published paper using them. I know I’ve seen people using half LSD bars.

Frank S. Corotto on September 12, 2024 at 3:02 PM said:

Thank you Dr. Gelman for your courteous opinion on the Corotto paper. I’m very sensitive to criticism. With comparative confidence intervals, you don’t have to think about “statistical significance”. I advocate the plotting of pairs of CCIs each with different alphas. For this, box and whisker plots are kindest to the eye. With pairs of CCIs, you can sit back, take in the big picture, and make up your own mind about what to (provisionally) believe. Consider figure 2 in the linked paper. Imagine instead a table of paired differences in which you have to think about what is subtracted from what to determine the direction. Thanks for your time. I never understood type S errors until I saw one of your posts. –Frank S. Corotto https://digitalcommons.gaacademy.org/gjs/vol81/iss2/11/


Statistical Modeling, Causal Inference, and Social Science
