Archive of posts filed under the Statistical graphics category.

Computing with muffins

This is Jessica. Florian Echtler put together a list of weird human-computer interaction papers that’s too good not to share. Urinal games, robots powered by household pests, and closeness with your remote partner through synchronized trash bins. I recall other projects from HCI researchers that have stuck in my memory because there was a […]

Impressive visualizations of social mobility

An anonymous tipster points to this news article by Emily Badger, Claire Cain Miller, Adam Pearce, and Kevin Quealy featuring an amazing set of static and dynamic graphs.

The Alice Neel exhibition at the Metropolitan Museum of Art

This exhibit closes at the end of the month so I can’t put this one on the usual 6-month delay. (Sorry, “Is There a Replication Crisis in Finance?”, originally written in February—you’ll have to wait till the end of the year to be seen by the world.) I’d never heard of Neel before, which I […]

Webinar: Theories of Inference for Data Interactions

This post is by Eric. This Thursday, at 12 pm ET, Jessica Hullman is stopping by to talk to us about theories of inference for data interactions. You can register here. Abstract: Research and development in computer science and statistics have produced increasingly sophisticated software interfaces for interactive and exploratory analysis, optimized for easy pattern […]

What can the anthropic principle tell us about visualization?

Andrew’s post on the anthropic principle implies statistical problems are one of three types: Those that are so easy that you don’t need stats (the signal is very strong relative to noise). Those that require stats because there’s some noise or confounding to be dealt with to recover the […]

Tableau and the Grammar of Graphics

The first edition of Lee Wilkinson’s book, The Grammar of Graphics, came out in 1999. Whether or not you’ve heard of the book, if you’re an R user you’ve almost certainly indirectly heard about the concept, because . . . you know ggplot2? What do you think the “gg” in ggplot2 stands for? That’s right! […]

Doubting the IHME claims about excess deaths by country

The Institute for Health Metrics and Evaluation at the University of Washington (IHME) was recently claiming 900,000 excess deaths, but that doesn’t appear to be consistent with the above data. These graphs are from Ariel Karlinsky, who writes: The main point of the IHME report, that total COVID deaths, estimated by excess deaths, are much […]

Any graph should contain the seeds of its own destruction

The title of this post is a line that Jeff Lax liked from our post the other day. It’s been something we’ve been talking about a long time; the earliest reference I can find is here, but it had come up before then, I’m sure. The above histograms illustrate. The upper left plot averages away […]

Size of bubbles in a bubble chart

(This post is by Yuling, not Andrew.) We like bubble charts. In particular, the bubble chart is the go-to visualization template for binary outcomes (voting, election turnout, mortality…): stratify observations into groups, draw a scatter plot of proportions versus group feature, and use the bubble size to communicate the “group size”. To be concrete, below is a graph […]

Whassup with the weird state borders on this vaccine hesitancy map?

Luke Vrotsos writes: I thought you might find this interesting because it relates to questionable statistics getting a lot of media coverage. HHS has a set of county-level vaccine hesitancy estimates that I saw in the NYT this morning in this front-page article. It’s also been covered in the LA Times and lots of local […]

When can a predictive model improve by anticipating behavioral reactions to its predictions?

This is Jessica. Most of my research involves data interfaces in some way or another, and recently I’ve felt pulled toward asking more theoretical questions about what effects interfaces can or should have in different settings. For instance, the title of the post is one question I’ve started thinking about: In situations where a statistical […]

Hullman’s theorem of graphical perception

Any experimental measure of graphical perception will inevitably not measure what it’s intended to measure. I extracted this “theorem” from various comments Jessica has made regarding her skepticism about empirical studies of the effectiveness of statistical graphics. Of course we should be doing empirical studies all the time, but you-know-who is in the details, as […]

Let them log scale

This post may seem like it’s on a six-month delay, but actually it’s not! Alexey Guzey sends a link to this blog post about a study done by some researchers at LSE and Yale earlier in pandemic history on how well understood log scales are. They randomly assigned 2000 American adults recruited online to […]

“Better Data Visualizations: A Guide for Scholars, Researchers, and Wonks”

Lee Wilkinson recommends this book by Jonathan Schwabish: I [Lee] think most books on “business” charts are junk, but this one is different. Schwabish does his work instead of incessantly quoting Tufte and online rants about pie charts. He’s one of the only writers on pie charts who seems to have read the research on […]

Tukeyian uphill battles

It seems that at least once a year, I find myself begging someone to make exploratory plots of some experimental data. I say begging because I have found that often when I’m being presented with some analysis and I ask questions like Did you plot all the variables first? or Did you look at this […]

Subtleties of discretized density plots

Many people are familiar with the idea that reformatting a probability as a frequency can sometimes help people better reason with it (such as on classic Bayesian reasoning problems involving conditional probability). In a visualization context, discretizing a representation of uncertainty, or really any probability distribution, can be useful for other reasons. For instance, by […]

Sketching the distribution of data vs. sketching the imagined distribution of data

Elliot Marsden writes: I was reading the recently published UK review of food and eating habits. The above figure caught my eye as it looked like the distribution of weight had radically changed, beyond just its mean shifting, over past decades. This would really change my beliefs! But in fact the distributional data wasn’t available […]

xkcd: “Curve-fitting methods and the messages they send”

We can’t go around linking to xkcd all the time or it would just fill up the blog, but this one is absolutely brilliant. You could use it as the basis for a statistics Ph.D. I came across it in this post from Palko, which is on the topic of that Dow 36,000 guy who […]

Most controversial posts of 2020

Last year we posted 635 entries on this blog. Above is a histogram of the number of comments on each of the posts. The bars are each of width 5, except that I made a special bar just for the posts with zero comments. There’s nothing special about zero here; some posts get only 1 […]

How many infectious people are likely to show up at an event?

Stephen Kissler and Yonatan Grad launched a Shiny app, Effective SARS-CoV-2 test sensitivity, to help you answer the question, How many infectious people are likely to show up to an event, given a screening test administered n days prior to the event? Here’s a screenshot. The app is based on some modeling they did with […]
