
Sleep injury spineplot

Antony Unwin sends along the above graph in response to this recent post.

The data are kinda crap, but I agree with Antony that this plot is a good way of showing the number of cases corresponding to each histogram bar.
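
For readers who have not met a spineplot before, here is a minimal sketch of the geometry, in plain Python rather than R. It is not the code behind the graph above. The counts are invented for illustration and the function name `spineplot_bars` is hypothetical. The idea is only that each bar's width is proportional to the number of cases in its group, while the shaded share within the bar is the proportion injured in that group.

```python
from typing import Dict, List, Tuple

# Hypothetical counts, invented for illustration only (NOT the survey data):
# group label -> (number injured, number not injured)
counts: Dict[str, Tuple[int, int]] = {
    "<=5 h": (30, 70),
    "6 h": (40, 160),
    "7 h": (45, 255),
    "8+ h": (40, 360),
}


def spineplot_bars(
    table: Dict[str, Tuple[int, int]]
) -> List[Tuple[str, float, float, float]]:
    """Return one (label, x_left, width, injured_share) tuple per group.

    Widths are proportional to group size and sum to 1; injured_share is
    the height of the highlighted segment inside a bar of height 1.
    """
    total = sum(injured + not_injured for injured, not_injured in table.values())
    bars = []
    x_left = 0.0
    for label, (injured, not_injured) in table.items():
        n = injured + not_injured
        width = n / total  # bar width encodes the number of cases
        bars.append((label, x_left, width, injured / n))
        x_left += width  # the next bar starts where this one ends
    return bars


for label, x_left, width, share in spineplot_bars(counts):
    print(f"{label:>6}: x={x_left:.2f} width={width:.2f} injured={share:.2f}")
```

Feeding these rectangles to any plotting library gives the basic picture: wide bars for big groups, narrow bars for small ones, with the same vertical scale for the proportion in every bar.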

18 Comments

  1. Wilte says:

    I like this design. Any chance the R code is available? And I always thought these graphs were called Marimekko plots. Is a spineplot the same, or are there differences?

  2. Rahul says:

    I used to be a big fan of creating custom graphs to visualise specific questions.

    But over a period of time I reached the conclusion that very often the custom graphs get in the way of effective communication.

    If the viewer has to make an extra effort to understand a nonstandard graph, he’s either going to skip it or, worse, draw wrong interpretations due to unfamiliarity.

    Oftentimes the effort won’t scale since this could be the only time he sees this format.

    Overall, I now stick to standard formats and try not to misrepresent, but I usually forgo the added advantage of a custom representation because of the learning-curve cost.

    • Christian Hennig says:

      I think it is essential to distinguish between whether a graph is used for communication with others (and then with whom), or whether I as a statistician (or a team of experienced data analysts) use it for myself to get a good idea of what goes on. Your objection refers to communication, not to learning what’s in the data, assuming that we understand the graph ourselves.

  3. Thomas says:

    Now the abscissa represents both sleep duration and group size.
    What about confidence intervals on the proportions?

    • Adede says:

      Yep. I guess it’s nice to know the (relative) size of each group, but it’s impossible to tell from this graph if there is a sufficient number of people in each group to reliably estimate the chance of injury.
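
To make the point about group size concrete, here is a minimal sketch of one common choice of interval for a proportion, the Wilson score interval. Nobody in the thread names a particular method, so picking this one is an assumption, and the counts in the example are made up. The function name `wilson_interval` is hypothetical, and `z=1.96` corresponds to an approximate 95% interval.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate Wilson score interval for a binomial proportion.

    z=1.96 gives roughly 95% coverage under the usual normal approximation.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    )
    return centre - half_width, centre + half_width


# Two hypothetical groups with the same observed proportion (30%)
# but very different sizes; the counts are invented for illustration.
for injured, n in [(3, 10), (300, 1000)]:
    low, high = wilson_interval(injured, n)
    print(f"{injured}/{n}: {low:.3f} to {high:.3f}")
```

The small group gets a much wider interval than the large one even though the observed proportion is identical, which is exactly the information the bar widths hint at but do not quantify.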

  4. jim says:

    This is a good way to present data.

    It still doesn’t mean anything. Again it’s the self-reported data and “injuries” of high schoolers.

  5. Antony Unwin says:

    @Andrew It was not a histogram bar, because the original graphic was not a histogram. You could have written “corresponding to each bar in the original barchart.” You are not alone. Why do you, and others, dislike clearly distinguishing between the two kinds of plot?

    @Wilte The graphic was produced using the doubledecker function in the R package vcd. Spineplots have been around since the mid-90s and are excellent for linked interactive graphics. Doubledecker plots are the multivariate version. Computer scientists seem to like the name Marimekko instead.

    @Rahul Spineplots are useful for all sorts of applications when you have categorical data: survey responses, medical studies, the Titanic dataset… How do you display bivariate (or indeed multivariate) categorical data?

    @Jim I very much agree with both your comments!

    • Andrew says:

      Antony:

      You’re right—it’s not a histogram! I didn’t make the mistake on purpose; I was just being sloppy, reacting to what the original graph looked like rather than the information it was conveying. Thanks for pointing out my error.

    • bjs12 says:

      Also known as a mosaic plot. The function mosaicplot() is part of base R and can reproduce this plot, and can be used for higher dimension problems as well (such as the Titanic data).

      • Elin J Waring says:

        +1 came to post this. I feel like it is underutilized, but that’s partly because the color scheme in base R makes them pretty ugly. It’s really a visualization of cross tabs.

    • Rahul says:

      They should put a spineplot like this one on a data comprehension question on something like the GRE and see how the test takers do compared to more common plot types.

      Ironically the one thing most missing from the data visualization field is empiricism about effectiveness of various formats.

      • Andrew says:

        Rahul:

        There’s actually lots of empirical work on the effectiveness of data visualizations. But the empirical work isn’t always so useful because it’s not clear what empirical outcomes should be measured, given that the goal of a graph is typically to learn some general pattern. A graph is not just a look-up table.

        • Rahul says:

          So I rarely see people write “I think we should display this like that because study X showed it’s a really effective way.”

          When it comes to graphs, legends, colors, ggplot, etc., mostly I see recommendations based on what looks like personal opinion.
