Sleep injury spineplot

Posted on October 8, 2020 10:39 PM by Andrew

Antony Unwin sends along the above graph in response to this recent post.

The data are kinda crap, but I agree with Antony that this plot is a good way of showing the number of cases corresponding to each ~~histogram~~ bar.

18 thoughts on “Sleep injury spineplot”

Wilte on October 9, 2020 at 2:25 am said:

I like this design. Any chance the R code is available? Also, I always thought these graphs were called Marimekko plots. Is a spineplot the same, or are there differences?

Rahul on October 9, 2020 at 4:14 am said:

I used to be a big fan of creating custom graphs to visualise specific questions.

But over a period of time I reached the conclusion that very often the custom graphs get in the way of effective communication.

If the viewer has to make an extra effort to understand a nonstandard graph, he’s either going to skip it or, worse, draw the wrong interpretation due to unfamiliarity.

Oftentimes the effort won’t scale since this could be the only time he sees this format.

Overall, I now stick to standard formats and try not to misrepresent, but I usually forgo the added advantage of a custom representation because of the learning-curve downsides.

- Christian Hennig on October 9, 2020 at 5:18 am said:
  
  I think it is essential to distinguish between whether a graph is used for communication with others (and if so, with whom), or whether I as a statistician (or a team of experienced data analysts) use it for myself to get a good idea of what goes on. Your objection refers to communication, not to learning what’s in the data assuming that we understand the graph ourselves.
  
  - Rahul on October 9, 2020 at 8:20 am said:
    
    Yes, I agree with that. For self-consumption anything goes really.
    
Thomas on October 9, 2020 at 5:21 am said:

Now the abscissa represents both sleep duration and group size.
What about confidence intervals on the proportions?

- Adede on October 9, 2020 at 8:23 am said:
  
  Yep. I guess it’s nice to know the (relative) size of each group, but it’s impossible to tell from this graph if there is a sufficient number of people in each group to reliably estimate the chance of injury.
  
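
To make the confidence-interval question in this sub-thread concrete, here is a minimal Python sketch of a Wilson score interval for a single proportion. The Wilson interval is just one common choice, swapped in for illustration, and the counts below are made up. They are not taken from the sleep/injury data. The point is only that the same observed proportion is pinned down far less precisely in a small group than in a large one.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion.

    z=1.96 corresponds to roughly 95% coverage.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical counts for illustration only (NOT from the survey discussed here):
# both groups have an observed proportion of 0.30, but very different sizes.
small = wilson_interval(6, 20)
large = wilson_interval(600, 2000)

for name, (lo, hi) in [("small group", small), ("large group", large)]:
    print(f"{name}: [{lo:.3f}, {hi:.3f}]")
```

In a spineplot the bar width shows the relative group size, so intervals like these would be the natural complement for judging whether a narrow bar's proportion can be trusted.
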
jim on October 9, 2020 at 10:03 am said:

This is a good way to present data.

It still doesn’t mean anything. Again it’s the self-reported data and “injuries” of high schoolers.

- Andrew on October 9, 2020 at 10:08 am said:
  
  Yup.
  
  - jim on October 9, 2020 at 9:59 pm said:
    
    What’s funny is that even I feel compelled to analyze it, just because there it is. I caught myself thinking “oh, look, well now the story is a little different!” Humans are explainers – even when they’re explaining random variation.
    
    - Andrew on October 10, 2020 at 8:56 am said:
      
      Jim:
      
      To put it another way: Yes, each of these kids’ injuries does have an explanation—it’s just that most of these explanations will have nothing to do with sleep patterns. When focusing on this variable, there’s a tendency to forget all the other factors that cause injuries. This is similar to the problem we’ve seen with regression discontinuity analyses, that researchers focus on one predictor as if this is all that matters.
    - Martha (Smith) on October 10, 2020 at 5:48 pm said:
      
      Or there could be factors that influence both sleep patterns and incidence of injuries.
Antony Unwin on October 9, 2020 at 11:03 am said:

@Andrew It was not a histogram bar, because the original graphic was not a histogram. You could have written “corresponding to each bar in the original barchart.” You are not alone. Why do you, and others, dislike clearly distinguishing between the two kinds of plot?

@Wilte The graphic was produced using the doubledecker function in the R package vcd. Spineplots have been around since the mid-90s and are excellent for linked interactive graphics. Doubledecker plots are the multivariate version. Computer scientists seem to like the name Marimekko instead.

@Rahul Spineplots are useful for all sorts of applications when you have categorical data: survey responses, medical studies, the Titanic dataset… How do you display bivariate (or indeed multivariate) categorical data?

@Jim I very much agree with both your comments!

- Andrew on October 9, 2020 at 11:24 am said:
  
  Antony:
  
  You’re right—it’s not a histogram! I didn’t make the mistake on purpose; I was just being sloppy, reacting to what the original graph looked like rather than the information it was conveying. Thanks for pointing out my error.
  
- bjs12 on October 9, 2020 at 1:12 pm said:
  
  Also known as a mosaic plot. The function mosaicplot() is part of base R and can reproduce this plot, and it can be used for higher-dimensional problems as well (such as the Titanic data).
  
  - Elin J Waring on October 9, 2020 at 4:08 pm said:
    
    +1, came to post this. I feel like it is underutilized, but that’s partly because the color scheme in base R makes them pretty ugly. It’s really a visualization of cross tabs.
    
- Rahul on October 10, 2020 at 9:13 am said:
  
  They should put a spineplot like this one on a data comprehension question on something like the GRE and see how the test takers do compared to more common plot types.
  
  Ironically the one thing most missing from the data visualization field is empiricism about effectiveness of various formats.
  
  - Andrew on October 10, 2020 at 9:40 am said:
    
    Rahul:
    
    There’s actually lots of empirical work on the effectiveness of data visualizations. But the empirical work isn’t always so useful because it’s not clear what empirical outcomes should be measured, given that the goal of a graph is typically to learn some general pattern. A graph is not just a look-up table.
    
    - Rahul on October 10, 2020 at 9:51 am said:
      
      So I rarely see people write “I think we should display this like that because study X showed it’s a really effective way.”
      
      When it comes to graphs, legends, colors, ggplot, etc., mostly I see recommendations based on what looks like personal opinion.
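
For readers unfamiliar with the format discussed in this thread, the geometry of a spineplot can be sketched in a few lines of Python. This is not the code behind the posted graphic (that was made with doubledecker in the R package vcd, per the comment above). It is a rough illustration under assumed inputs: the function name `spineplot_rects`, the group labels, and the counts are all hypothetical. Each group gets a bar whose width is its share of all cases, and the bar is split vertically by the proportion with the outcome.

```python
def spineplot_rects(table):
    """Compute spineplot bar geometry from a two-way table of counts.

    table maps a group label to (count_with_outcome, count_without_outcome).
    Returns a list of (label, x_left, width, share_with_outcome), where the
    widths sum to 1 and share_with_outcome is the height of the shaded part.
    """
    total = sum(yes + no for yes, no in table.values())
    if total == 0:
        raise ValueError("table has no cases")
    rects = []
    x_left = 0.0
    for label, (yes, no) in table.items():
        n = yes + no
        width = n / total                # bar width = relative group size
        share = yes / n if n else 0.0    # shaded height = conditional proportion
        rects.append((label, x_left, width, share))
        x_left += width
    return rects


# Hypothetical counts for illustration only (not from the data in the post).
counts = {"A": (30, 70), "B": (60, 240), "C": (20, 80)}
for label, x_left, width, share in spineplot_rects(counts):
    print(f"{label}: x={x_left:.2f} width={width:.2f} shaded={share:.2f}")
```

An ordinary bar chart of proportions would draw all three bars at the same width. The spineplot differs only in letting the width carry the group size.
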

Statistical Modeling, Causal Inference, and Social Science
