Paul Alper writes:
Here is a fascinating article by Matthew Cappucci from the Washington Post dealing with the difficulty experts have when trying to convey technical results to the lay public.
In a nutshell, the categories the experts at the Storm Prediction Center use:
marginal, slight, enhanced, moderate or high
do not correspond to the linguistic feelings of the general public, the intended audience of the typical TV/radio weather forecaster:
It’s been shown time and time again, however, that the general public doesn’t understand the categories. Is a slight risk or a marginal risk more significant? And why is moderate risk not in the middle? What’s an enhanced risk? Meteorologists know the system like the back of their hand. But the public does not.
A student, Alex Forbes, at Mississippi State University gathered the following data. It shows that the public is all over the linguistic map when it comes to ordering and interpretation of the terms used.
From the above graphic, “High” is the only category in which the experts and the general public are in agreement.
As part of his project, Forbes asked nearly 4,000 respondents to rank the presumed order of the SPC’s five categories based on their implied severities. On the whole, respondents got 4 out of the 5 categories wrong.
“The SPC outlook was never meant for public consumption,” he said. “It still isn’t to this day. The only reason it’s published is because [the SPC is] required to by federal law as a federal agency.”
According to Patrick Marsh, chief of science support at NOAA’s Storm Prediction Center, the problem of misinterpretation began when:
After reviewing feedback from emergency management, they broke it into three categories: marginal, slight, and enhanced. That’s when public confusion really ramped up.
“There is no real word that fits between ‘slight’ and ‘moderate’ that works,” Marsh said. “‘Enhanced’ showed a little bit more promise. But SPC knew there was going to be a problem there. The words aren’t perfect.”
Sean Ernst, a student at the University of Oklahoma, points out that the SPC terminologies
“were never designed necessarily with the general public in mind.”
“You can have a perfectly accurate forecast,” Ernst said, “but a forecast has no value unless a user can make an educated decision based on it.”
Obviously, this problem of miscommunication between experts and the lay public is not confined to weather forecasting. The term “statistically significant” comes readily to mind. As do “peer review,” “margin of error,” and many others. So, how to do better when defining terms in order to communicate with the general public?
I’m reminded of the saying, “It’s harder to do the wrong thing right than to do the right thing right.”
Also, the numbers in the table above should be rounded to the nearest percentage point. “38.6%,” indeed.
If they want a widely accessible and easily understood scale nowadays, I’d suggest the following three levels.
OMG
OMG!!
OMFG!!!!
I remember a paper about communications in the workplace from a long time ago. There was a nice official sign on a machine, something like “Inappropriate operation of this device may have serious or lethal consequences”.
A handwritten sign beside it read: “THIS MACHINE CAN KILL YOU”.
Andrew failed to highlight that the Washington Post article is dated June 10, 2020. Practically pre-history in this Covid era and less than a year after Hurricane Dorian and Sharpiegate of September, 2019:
https://en.wikipedia.org/wiki/Hurricane_Dorian%E2%80%93Alabama_controversy
Note that Sharpiegate is an extreme example of rounding “to the nearest percentage point.”
“A third investigation being done by a committee of the U.S. House of Representatives has not yet been released.”
Speaking of p values, maybe we should create a new “Office of Numerical Assessment of Multiply Investigated Events” to perform a statistical significance analysis on the population of all investigations into each multiply-investigated event!
I don’t think the weather example is a good one for the difficulty of communicating between experts and the public. The weather warnings have always been unintelligible for most – confusing terms with no real logic. Of course, those that use it regularly know the differences, but they also know that the terms are not good choices for anyone else. The only interesting question (to me) is why it has taken so long to think about changing it.
Terms like “statistical significance” are not merely jargon shared within an expert community. The meaning of the term is not simply definitional – it represents more subtle and substantive issues (as evidenced by numerous discussions on this blog and elsewhere). A person’s lack of understanding of the term may represent a lack of understanding of evidence, randomness, probability, or other fairly deep and controversial topics.
Failure to understand whether “marginal” is more or less severe than “enhanced” or “slight” represents nothing more than a failure to know how these terms were defined – and, a failure of those who designed the terms to be willing to (or able to) change to more easily understood language. Just try to define “statistical significance” in easily understood language (I suspect some people will now try and I’ll let their attempts be evidence about my assertions).
+1
Another difference with “statistical significance” is that, while most meteorologists apparently don’t have a problem with the scale wording, most of the professionals/experts who use “statistically significant” don’t know what it means, or use it inappropriately, or they both understand it and use it appropriately as terminology but then ignore its proper meaning and import in practice (e.g., editors or reviewers who insist on giving such results undue weight and non-significant results discounted weight).
To paraphrase: “You can have a perfectly accurate statistical conclusion, but a statistical conclusion has no value unless an author can assert a valid research conclusion based on it.”
I would argue the weather example is a good one for precisely the same reasons you outlined.
Compared to explaining “statistical significance” to the public,
coming up with a one-dimensional scale of risk that the public could easily grasp is trivially easy
(1-5, S-M-L-XL, even short-tall-grande-venti would be better than marginal-slight-enhanced-moderate)
yet we still fail at it.
Yes, it was not originally designed for public consumption and changing a naming convention can be surprisingly costly.
I get it but it does not change the fact that the public does consume them now.
In the grand scheme of things, it is not a huge problem, but it is instructive
of how scientific communication can fail and continue to do so even with relatively easy solutions (and no competing agenda).
> changing a naming convention can be surprisingly costly
A coworker yesterday pointed out that in a situation like this something like changing the names to match the survey is the bad thing to do. This creates ambiguity between the old names and new names which would require carefully documenting the switch in all the systems (not gonna work).
If you completely replace all the words though, so there’s an unambiguous mapping from old to new, then it’s as easy as it gets (though still takes work) to have downstream systems automatically convert to the new thing on reading and you just forget the old thing even exists.
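To make that concrete, here is a minimal sketch of "convert on reading", assuming a hypothetical replacement vocabulary of "level 1" through "level 5". The old names are the five SPC categories in their official order; the new names and the `normalize` helper are invented for illustration and are not anything the SPC has proposed.

```python
# Old SPC labels mapped to a HYPOTHETICAL new vocabulary (assumption for illustration).
OLD_TO_NEW = {
    "marginal": "level 1",
    "slight": "level 2",
    "enhanced": "level 3",
    "moderate": "level 4",
    "high": "level 5",
}

# The conversion is only unambiguous if no new label reuses an old word.
assert not set(OLD_TO_NEW) & set(OLD_TO_NEW.values())


def normalize(label: str) -> str:
    """Return the new-scheme label for a value stored under either scheme."""
    key = label.strip().lower()
    if key in OLD_TO_NEW:
        return OLD_TO_NEW[key]  # old record: convert on read
    if key in OLD_TO_NEW.values():
        return key  # already in the new scheme
    raise ValueError(f"unknown risk label: {label!r}")
```

If the new scheme kept a word like "slight" but moved it to a different rung, the disjointness check would fail, and a stored "slight" could no longer be interpreted without knowing when it was written.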
?? I’m not sure how this is an example of the difficulty of communicating to the lay public.
Here we go: “1” = low risk…2, 3, 4, 5 = “high risk”. Is this too difficult?
Who would know the difference between “Marginal”, “Slight”, and “Moderate” without reading a specific definition? These are dumb words for weather forecasters even to use among themselves. This is a good example of how scientists get wrapped up in jargon and can’t let it go.
IMO scientists often have a problem because they want their words to mean too many things at once. You can’t collapse three or four dimensions of variation (precip, windspeed, wind duration, max gust, flood risk etc) into a single verbal scale and expect the public to get it, since it can’t be done sensibly anyway.
That would be a definite improvement, given the public’s exposure to hurricane and tornado numerical categories. The only people we’d have to worry about are the 2.4% who get confused by whether 1 is highest danger or lowest danger, even when they’re told explicitly in the survey question…
I think the other issue is that it might not be obvious what the number is “out of”. Is 5 the highest, or is it 10? Or 100?
Personally I’d go for Very Low, Low, Medium, High, Very High. Simple and unambiguous.
+many
“Very Low, Low, Medium, High, Very High. Simple and unambiguous.”
Yeah but the stoners would get that one backwards.
“are the 2.4% who get confused by whether 1 is highest danger or lowest danger”
I’d guess that your estimate is low by an order of magnitude. At least.
+1
Why map a continuous decimal to a category in the first place?
Why not just stick to a number? 95, 30, something like that?!
Do these categories refer to severity of storm? Probability of a severe storm? If the latter, why not just give the numeric value?
I’m not sure I have five distinct responses to severe weather warnings.
Slight/marginal/nominal/small/outside chance of severe weather: Keep an eye open.
Moderate/enhanced/reasonably good/somewhat likely/better than not: Factor this into my plans.
High/very high/all-but-certain/OMFG: Consider staying home.
Andrew, you may recall (but probably don’t) that the bowling alley we used to go to in White Oak, Maryland, had a snack bar that sold drinks in three sizes: large, extra-large, and super-jumbo.
I’m more or less in agreement with jim or even Brent…certainly in agreement with the thrust of a lot of these comments, which is that terms such as ‘marginal’ and ‘enhanced’ are so obviously terrible that whoever suggests them should be shot, or at least fired. (OK, I went a bit farther than the other commenters, but anyway we all agree that whoever is coming up with these terms at the Storm Prediction Center is doing a lousy job).
I’m imagining people trying to “rank the following words from least to most threatening” if the words were, say: “very low, low, medium, high, very high.” I’d bet more than half the people taking the survey would get every one of those in the right order!
Perhaps the SPC has a rule that they have to use single words, so “very low” and “very high” are not allowed? Well, hey, I have a fix for that: “tiny, low, medium, high, extreme.” Now, admittedly “extreme” could conceivably be mistaken for “extremely low”, or maybe you could imagine it being in between medium and high, but I’d still bet that this would be alright.
If “tiny” is no good because it could be mistaken for the spatial extent of a storm rather than its risk, then maybe “innocuous” or “low-risk” (if that doesn’t violate the one-word rule, if there is such a rule).
In any case I think it is not hard to come up with categories that are way better than the ones they have.
Phil said “certainly in agreement with the thrust of a lot of these comments, which is that terms such as ‘marginal’ and ‘enhanced’ are so obviously terrible that whoever suggests them should be shot, or at least fired.”
How about boiled in oil — but they have to specify whether the oil is boiling at a “marginal” or “enhanced” rate. ;~)
Maybe let them choose between sautéed, seared, pan-fried, braised, broiled, or poached.
Phil –
> In any case I think it is not hard to come up with categories that are way better than the ones they have.
If I could offer some suggestions:
Extremely bad.
Bad.
Pretty bad.
Not so bad.
Not so good.
Meh.
Piddly.
I’ve lived in Minnesota long enough to tell you that “not so bad” and “not so good” need to switch places.
Jeff –
I actually thought about that. So my system ain’t so great either. Or would that be it’s pretty bad also?
Could be worse.
Is this akin to
“Could care less” (US)
Vs
“Couldn’t care less” (other English speakers)?
“the bowling alley we used to go to in White Oak, Maryland, had a snack bar that sold drinks in three sizes: large, extra-large, and super-jumbo.”
Sounds like the old joke about sizes on packets of condoms.
Seems like the solution would just be to give storms names that convey their strength, perhaps making use of the public’s sexism…
Wins thread.
I think if scientists were a bigger part of public life, maybe some of these problems would go away because (a) more of the general public would have more familiarity with the common terms various scientific communities use and (b) interacting with the public more might clue scientists in to just how confusing a lot of our current terminology is. But nothing gets more pushback from scientists on the topic of improving science literacy than the suggestion that maybe we should spend some time actually interacting with the common folk. There’s this really peculiar prevalent feeling across academia that talking with (not to, with) the general public is beneath them, which unfortunately leaves a vacuum to be filled by the likes of the Jordan Petersons and Steven Pinkers of the world, who aren’t helping things in the slightest.
This does not match my experience at all. It’s generally not hard to get scientists to give public talks, for example, and the limiting factors for activities like this tend to be time and good venues for engagement. I have literally never met anyone in the past 15 years who considers interacting with the general public “beneath them.” I’m curious what sort of institution you’re at. (Me: public R1 in the US.)
It’s easy to design a good 5-point scale from scratch, but what actually happened on Oct 22, 2014, was an update from a 4-part scale (see text – slight – moderate – high) that replaced “see text” with “marginal” and split “slight” into “slight” and “enhanced”.
Source: https://www.washingtonpost.com/news/capital-weather-gang/wp/2014/10/17/storm-prediction-center-thunderstorm-outlook-enhancements-go-live-next-week/
If the condition is to not change established meanings, the solution isn’t so easy.