Anders Lamberg writes:

In my mails to you [a few years ago], I told you about the Norwegian practice of monitoring the proportion of escaped farmed salmon in wild populations. This practice results in a yearly updated list of the situation in each Norwegian salmon river (we have a total of 450 salmon rivers, but not all of them are surveyed – however, the most important ones are, more than 200 each year). There are several methods used to “extract” a sample and an estimate of the proportion of farmed salmon from each surveyed river, and the big discussion has been: What do these methods give us? In practice it boils down to a statistical question: What is the precision of the estimates?

As I mentioned before, the number you get from a survey will have both ecological and economic implications. If the calculated proportion of escaped farmed salmon is above a defined limit, it can tell you how the total productivity of the wild salmon population in that river is affected. Wrong numbers may lead to the conclusion that there is no problem, when there is a problem that should be addressed. Proportions of farmed salmon above the critical limit may also, in the long run, lead to measures taken in the fish farming industry. New net pens that are more “escape safe”, reduction in the total volume of farmed fish produced, tagging of all farmed fish in the net pens in order to be able to track where escapees come from – these are all good measures, but they cost a lot of money. That’s where it gets political. And to be slightly more specific: we do not speak of millions, we speak of billions.

In this drama you have several actors:

1. The government, democratically elected by the people. The ones who decide.

2. The government’s bureaucratic staff. The ones who give advice to the ones who decide.

3. The scientists. The ones who produce the numbers for the bureaucrats. But science in this context is also a business. There are a lot of private or semi-private institutions participating in projects to acquire the relevant numbers. There are in fact few or no government-employed scientists working with salmon in Norway, only researchers in private companies.

4. The land owners who own the rivers and earn money on sport fishing tourism (too many escaped farmed salmon may result in fewer wild salmon)

5. The fish farmers that produce and sell farmed salmon

6. The lobbyists that speak on behalf of the different economic interests

a. The ones that are working for the fish farmers

b. The ones that are working for the land owners and sportfishing tourism

7. Environmental activists

a. Some work for the salmon itself and the future of their children (they have no direct economic incentive)

b. Some work under a false flag and are actually working for the different groups of scientists or for the industry (these lobbyists gain economically from their effort)

8. The journalists

a. The ones that have friends in the fish farming industry (they may even own stock shares in the fish farming companies)

b. The ones that have friends among the land owners. These journalists are often keen sports fishers themselves.

The big drama is often “powered” by the scientists/researchers and a discussion among them about numbers and methods. The outcome of these discussions decides which scientists get money and which do not. This also leads to different publications, each supporting a different group of scientists. One such publication appeared in late 2016, when one of the most recognised statisticians in Norway was asked to evaluate what could be inferred from estimates with wide confidence intervals, originating from a sampling method with few individual salmon sampled from each river. The conclusion of the report was that the wide confidence intervals made it impossible to use the results as a tool for managing either the fish farming industry or the wild salmon. Although this highly qualified statistician had concluded, the scientists whose project portfolios were reduced by his conclusions refused to accept them. What followed next is the big drama, where journalists were fed information by the aggrieved scientists. The journalists were not able to understand what they actually wrote, but thought they were on safe ground, because the whole story made sense with reference to what they themselves believed. The government was also unable to understand the statistical implications, but probably picked the conclusion that was in line with their overall goal: wild salmon or farming industry, depending on which political party they represented. At the same time they wanted to be sure that their conclusions would not backfire on them if new data emerged. This in turn led to meetings, some secret, some official. This drama partly led to a situation where a high-ranking politician had to leave his position as a minister. The confidence interval discussion was not the sole reason, but it surely helped on his resignation.

So what can we learn from this? How can we establish a scientific foundation that works as a robust tool for the best of both business and nature? I think the answer lies in the statistics and its correct interpretation, and in realising that the average researcher does not completely understand the tool he/she uses. It does not help that Neyman published the answer in 1937, because an average biologist is not able to understand what Neyman writes. That is why the responsibility lies on you professional statisticians. Here is what I feel is not communicated in a sufficiently clear way.

In my textbooks there is a huge difference between a point estimate and an interval estimate. The wild salmon / farmed salmon example illustrates what I mean. When you sample 50 salmon from a river and 5 of them turn out to be farmed salmon, the researcher publishes the value 10 % farmed salmon. He/she also publishes a 95 % confidence interval, but that interval is put in brackets behind the 10 %. It is left in the dark shadows of the middle value of the sampling distribution. It is here, I think, that the big mistake emerges. If you calculate a 95 % confidence interval, this calculation refers to the method (Neyman 1937), not to the real value. What is done is that the 10 % is referred to as a point estimate by the biologists. This may be formally right, but misleading. A point estimate is a different thing. If it is a point estimate, it is a really bad one – with low probability of matching the real value. What we have actually done here is produce an interval estimate. It is correct that “our best guess” is 10 %, but I mean that this value should never be reported. Only the interval and the confidence level should be reported. Politicians, researchers, journalists etc. who do not fully understand what the results from the sample give us will misunderstand.
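To make the width concrete: here is a minimal sketch in Python of a 95 % interval for 5 farmed fish out of 50 sampled, using the exact (Clopper-Pearson) method via scipy. The numbers 5 and 50 come from the example above; the choice of the Clopper-Pearson method is my own assumption, not something specified in the Norwegian surveys.

```python
from scipy.stats import beta

def clopper_pearson(k, n, conf=0.95):
    """Exact (Clopper-Pearson) confidence interval for a binomial proportion."""
    alpha = 1.0 - conf
    lo = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lo, hi

# 5 farmed salmon found in a sample of 50 fish
lo, hi = clopper_pearson(5, 50)
print(f"point estimate: {5/50:.0%}, 95% CI: ({lo:.1%}, {hi:.1%})")
# The interval runs from roughly 3% to more than 20% --
# far wider than the headline figure "10%" suggests.
```

The interval is several times wider than the distance from 0 to the point estimate itself, which is exactly the situation where reporting “10 %” alone is misleading.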

It gets even worse if you plot the probability curve for the outcome of the sampling of the 50 salmon which included the 5 farmed fish. The visualisation by a curve with the 10 % in the middle leads people to think that it is highly probable that the real value – the value that should guide us to make the right decisions – lies close to 10 %. This is not what has actually been found. If you do the whole sampling once more, if you sample 50 new salmon from that population, it is likely that you get a very different result. Maybe 1 %. Then the drawing of the curve will make a totally new picture. Those who are not able to read Neyman 1937 and fully grasp the concept (and that is the vast majority – including professors and the prime minister) will believe that they have found something quite accurate. And if they have seen the first curve, with 10 % in the middle, and are told that this comes from the same population, they would think that the new result, the 1 %, is from a totally different population. But it is not.
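The sample-to-sample jumping described here is easy to see in simulation. A minimal sketch, assuming for illustration a true proportion of 10 % and repeated independent samples of 50 fish (the true proportion and seed are made up; the point is only the spread):

```python
import numpy as np

rng = np.random.default_rng(42)
true_p, n, reps = 0.10, 50, 1000

# Draw 1000 independent samples of 50 fish from the same population
# and record the observed proportion of farmed salmon in each.
estimates = rng.binomial(n, true_p, size=reps) / n

print(f"true proportion: {true_p:.0%}")
print(f"observed estimates range from {estimates.min():.0%} to {estimates.max():.0%}")
# Samples from the *same* population can easily give 2% one time and 18% the next.
```

Two surveys of the same river can thus disagree wildly without either being “wrong,” which is precisely what the two curves in the text would show.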

The other way of looking at it is to calculate a 50 % (instead of a 95 %) confidence interval from the sample that gave us, for example, 10 % farmed salmon in the population. Most will understand that a 50 % confidence interval is not something you would rely on to make serious political decisions. This interval would also have the value 10 % in the middle. For a non-statistician there is no difference between the plotted curves of the 95 % and the 50 % intervals. Without referring to the interval itself, and only that, it will not be understood.
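A quick way to see this numerically: compute the 50 % and the 95 % intervals from the same sample of 50 fish with 5 farmed. This is a sketch using the exact (Clopper-Pearson) method, which is my own choice for illustration:

```python
from scipy.stats import beta

def clopper_pearson(k, n, conf):
    """Exact (Clopper-Pearson) confidence interval for a binomial proportion."""
    alpha = 1.0 - conf
    lo = beta.ppf(alpha / 2, k, n - k + 1) if k > 0 else 0.0
    hi = beta.ppf(1 - alpha / 2, k + 1, n - k) if k < n else 1.0
    return lo, hi

# Same data (5 farmed out of 50), two different confidence levels.
intervals = {conf: clopper_pearson(5, 50, conf) for conf in (0.50, 0.95)}
for conf, (lo, hi) in intervals.items():
    print(f"{conf:.0%} CI: ({lo:.1%}, {hi:.1%})")
# Both intervals sit around 10%, but the 50% interval is much narrower --
# a plot centered on 10% does not by itself tell you which one you are seeing.
```

The midpoints coincide; only the endpoints and the stated confidence level carry the information that distinguishes a usable result from a useless one.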

I think that an average biologist (or other researcher) mixes up the concepts. They are used to measuring, for example, the body length of fish in a sample. They use sample statistics and get an estimate of the population mean length from their sample. They calculate the distribution and are quite confident that they have a good representation of the population mean. But when they calculate a confidence interval from a sample proportion, they think the same way: the middle value of the distribution is close to the real value. If the sample size is large, there is of course no big problem. They do not see that wide confidence intervals must be interpreted in a different way.

I think the solution to this problem is that it “should be forbidden” to report the middle value of a confidence interval. Only the limits of the interval and the confidence level should be reported. In this way you would force the researchers to think the right way. You would also communicate better with the public, the journalists and the politicians if the middle value of the confidence interval were never mentioned again. The middle value tells you very little in small samples where you only sample once. It is acquired by a random process and would, with high probability, jump back and forth if you were able to sample many times.

This was quite radical, but have I misunderstood this completely?

My reply: I don’t know that it’s much of a solution to not report the middle of the interval. I say this for a few reasons: First, the endpoints of an interval will in general be noisier than the midpoint. Second, I think it’s a big mistake to make decisions or inferential summaries based on whether an interval excludes zero. Third, if you’re using classical nonregularized estimates, then the middle has problems but so do the endpoints; consider for example some of the estimated effects of early childhood intervention.

That said, I agree with the general points expressed above. I see two big issues:

1. Lots of people want certainty when it’s not appropriate. We have to have a way of saying something in between “I know nothing” and “I’m sure.”

2. When data are sparse, you should be able to do better using prior information, but there’s a lot of resistance to doing that.

I think it would help to frame these problems more decision-theoretically, rather than assuming they need to be mediated by some kind of summary statistics (posterior intervals, confidence intervals, etc.).

If you want to present a probability distribution and avoid some of these problems, how about converting to a cumulative formulation? Then you can provide probabilities that the true value is beyond a certain threshold (for a few decision-relevant thresholds) rather than trying to summarize directly with a point estimate or interval.
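For the salmon example, one hedged sketch of this cumulative formulation: with a uniform prior and 5 farmed fish out of 50 sampled, the posterior for the proportion is Beta(6, 46), and tail probabilities can be read off directly. The thresholds below are invented for illustration, not official management limits:

```python
from scipy.stats import beta

# Posterior for the farmed-salmon proportion: uniform Beta(1, 1) prior
# updated with 5 farmed fish out of 50 sampled -> Beta(1 + 5, 1 + 45).
posterior = beta(6, 46)

# Hypothetical decision-relevant thresholds (made up for this sketch).
for threshold in (0.05, 0.10, 0.20):
    print(f"P(proportion > {threshold:.0%}) = {posterior.sf(threshold):.2f}")
```

A statement like “the probability that the proportion exceeds the critical limit is X” speaks directly to the decision, without anyone needing to interpret the middle of an interval.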

I’ve seen this, too. The same confusion can arise in an MCMC setting. For example, in Stan we report an MCMC standard error for all reported posterior expectations (e.g., parameters for posterior means and indicators for event probabilities). We also report some quantiles by default, which can be used to define the boundaries of posterior intervals. We’re in the process of moving default reporting to also report on convergence for the quantiles, including tail quantiles, which can show different convergence properties than the means.

Hi Bob, that would be great! The N-eff needed to do a good job even for 95% intervals can be surprisingly large.
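The point about tail quantiles needing many draws can be illustrated even without MCMC: with plain independent draws, an estimate of the 97.5 % quantile is considerably noisier than an estimate of the mean from the same sample. A small sketch (pure simulation with made-up sizes, nothing Stan-specific):

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws, reps = 1000, 2000

# For each of 2000 replications: take 1000 iid standard-normal draws,
# then estimate both the mean and the 97.5% quantile from the same sample.
samples = rng.standard_normal((reps, n_draws))
mean_est = samples.mean(axis=1)
q975_est = np.quantile(samples, 0.975, axis=1)

print(f"sd of mean estimates:           {mean_est.std():.4f}")
print(f"sd of 97.5% quantile estimates: {q975_est.std():.4f}")
# The tail-quantile estimate is several times noisier than the mean,
# so a much larger (effective) sample size is needed for the same precision.
```

With autocorrelated MCMC draws the gap only widens, which is why separate convergence diagnostics for tail quantiles are worth reporting.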

A posterior distribution of the estimate will be very informative. Different stakeholders are free to choose their priors and present their findings. That said, it likely won’t prevail over the tendency of humans to discard evidence not aligned with their prior beliefs, especially evidence generated from a process they don’t understand at all.

All of this assumes that an appropriate method – from the biological/environmental standpoint – has been used, i.e., that you’re not sampling 10 yards from a farmed salmon pen, and that there are no behavioral differences between farmed and natural salmon that would mean your sample location has more or fewer of either… so it’s not really just about stats. Sometimes “statistical” work fails before the calcs start.

No matter how you frame the results, all the nuance will be lost when it is filtered through a motivated layer of journalists/lobbyists/activists.

The alarmist side will say “as much as 40% of the salmon in our rivers are bio-engineered fish that have escaped the factory farms.” Activists will hear this as “40% of the salmon in our rivers are GMO FrankenSalmon,” and will repeat something even more extreme.

The apologist side will say that “alarmist fears are overblown, there is no hard evidence that the new salmon have entered our streams, much less that there has been any harm.”

This sounds familiar to me, since I was involved in the California Water Wars for many years, especially the parts dealing with salmon in the Sacramento River. This is a human problem, not a statistical one. Humans are social animals, so an effective response is to create a group including the technical people representing the various factions. In California, this is the California Water and Environmental Modeling Forum, which is run by the relevant scientists and engineers, and puts on an annual meeting as well as classes and workshops. Working together on these activities builds a bond between the technical people representing the various factions, and encourages them to build consensus. It hasn’t solved all the problems, but it has helped a lot.

Anders:

These plots might help https://statmodeling.stat.columbia.edu/2019/05/29/concurve-plots-consonance-curves-p-value-functions-and-s-value-functions/

They would at least show how slowly “support” falls off as one moves away from the point estimate.

Now, enabling non-statisticians to more fully grasp uncertainty and how statistics (tries to) adequately quantify it – is a real big open question.

Keith said,

“Now, enabling non-statisticians to more fully grasp uncertainty and how statistics (tries to) adequately quantify it – is a real big open question.”

In case it might be useful to others trying to tackle this problem — here’s a link to an attempt I once made at this:

https://web.ma.utexas.edu/users/mks/statmistakes/uncertainty.html

(Some of the external links are broken, but some of those can easily be retrieved by a simple search on the information given.)