Better than a boxplot

I’d love it if someone else were to write my article, tentatively titled “Better than a boxplot,” with the following abstract: “We demonstrate graphical options that dominate the boxplot. We hope that, once these alternatives are understood, boxplots are never used again.” But I have a horrible feeling I’m going to have to write this article myself.

13 thoughts on “Better than a boxplot”

  1. Well, the abstract is a start. Now you just need the conference to send it to. Then you wait 6 months before you realise that the meeting is only a week away, so you'd better do something about it.

  2. The other option would be to find a journal I've never published in before where I could submit it. Then I'd do it right away.

  3. Maybe someone should write a plot function in R for jittered dot plots that is as easy to use as the boxplot function. This would influence me. The boxplot function is quite flexible, easy to use, and general. But usually when I make a jittered dot plot, I have to fiddle with it a lot to get the dots small enough, to add the median lines, to make the column width wide enough to be able to see the dots…
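The fiddling described in this comment (spreading the dots, adding median lines) can be separated from the drawing itself. What follows is only a rough sketch in Python rather than R, and the function name `jittered_dotplot_coords`, the `width` default, and the toy data are all made up for illustration. It computes where the dots and median lines would go, without drawing anything.

```python
import random
import statistics

def jittered_dotplot_coords(groups, width=0.3, seed=0):
    """Return (xs, ys, medians) for a jittered dot plot.

    groups: dict mapping a group label to a list of numbers.
    Group i is centred at x = i; each dot is shifted horizontally by a
    uniform random amount in [-width, width] so tied values do not overlap.
    """
    rng = random.Random(seed)  # fixed seed keeps the layout reproducible
    xs, ys, medians = [], [], {}
    for i, (label, values) in enumerate(groups.items()):
        for v in values:
            xs.append(i + rng.uniform(-width, width))  # jitter x only
            ys.append(v)                               # data value is untouched
        medians[label] = statistics.median(values)     # height of the median line
    return xs, ys, medians

# Hypothetical toy data, purely for illustration.
data = {"a": [1, 2, 2, 3, 10], "b": [4, 4, 5, 6]}
xs, ys, medians = jittered_dotplot_coords(data)
```

The returned coordinates could be handed to any scatter-plot routine, with one short horizontal line per group drawn at its median. Dot size and column width would still need defaults that scale with the number of points, which is the hard part the comment is pointing at.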

  4. You could try Pharmaceutical Programming. You do have to send Word, though.

    I have to say I use Rob's HDR boxplots wherever possible. They are especially good if you may have bimodality.


  5. Start by writing a single function in R that produces plots that dominate the boxplot, has good defaults, and also can deal with both large and small datasets in a good way.

    The boxplot is pretty darn useful, but I can easily believe that it can be dominated. I would use your function. Once you've got the function, a 2 page or 3 page paper about why it is better would be pretty easy.

  6. Doesn't it depend on the size of the data set and the goal of the graphical display (inference vs data description, exploratory vs publication)?

    For displaying data distributions, would it be reasonable to say

    * small to moderate: dotplots, jittered dotplots (small, discrete data sets with lots of repeated values are a special challenge)
    * moderate: boxplots
    * moderate to large: violin plots, density strips, beanplots, box-percentile plots [Hmisc::panel.bpplot], etc.

    Or do you think the range in which boxplots are useful is squeezed out by the ranges of the "small to moderate" and "moderate to large" choices?

  7. Hi,

    The HDR boxplots are in the R package hdrcde, which you can get from CRAN or from Rob Hyndman's page:
    I don't know of any implementations in SAS.

    Bimodality is very common in biological data from populations. Another good idea which is relatively easy to implement is to use histograms with quantiles as the cut points. This is much better for multimodality than standard histogram methods.

    See also the recent post from Andrew about histograms with more complicated ways to calculate the cut points.
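The quantile-cut histogram mentioned in this comment can be sketched in a few lines. This is only one reading of the idea, written in Python with the standard library; the function name `quantile_histogram`, the choice of four bins, and the "inclusive" quantile method are assumptions, not anything specified in the comment. Each bin holds roughly the same share of the data, so narrow bins become tall bars where the data are dense.

```python
import statistics

def quantile_histogram(values, n_bins=4):
    """Histogram whose cut points are sample quantiles.

    Returns (edges, densities). Each bin holds roughly 1/n_bins of the
    data, so the bar height is (1/n_bins) divided by the bin width.
    """
    # Interior cut points: the n_bins - 1 sample quantiles.
    cuts = statistics.quantiles(values, n=n_bins, method="inclusive")
    edges = [min(values)] + cuts + [max(values)]
    densities = []
    for lo, hi in zip(edges, edges[1:]):
        width = hi - lo
        # A zero-width bin (many tied values) has no finite density; mark it None.
        densities.append((1 / n_bins) / width if width > 0 else None)
    return edges, densities

# Hypothetical evenly spaced sample: every bin comes out with the same width.
sample = [1, 2, 3, 4, 5, 6, 7, 8, 9]
edges, densities = quantile_histogram(sample)
```

With evenly spaced data the bars are all the same height. With bimodal data the bins near each mode would be narrow and tall and the bin spanning the gap wide and low, which is why this display can show multimodality that equal-width bins may smooth over.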



  8. That bottom-left graph from MATLAB is beautiful. Is there nothing like it in R?
    I agree that we are deeply in need of a better plotting function than R's boxplot(). It is the easiest to use because of the robust way it handles data input, but it doesn't give much in the way of robust output.

    If anyone knows of a way to make a plot like the MATLAB one Jesse linked, I'd sure love to hear about it.
