Is explainability the new uncertainty?

This is Jessica. Last August, NIST published a draft document describing four principles of explainable AI. They asked for feedback from the public at large, to “stimulate a conversation about what we should expect of our decision-making devices.”

I find it interesting because from a quick skim, it seems like NIST is stepping into some murkier territory than usual. 

The first principle suggests that all AI outputs resulting from querying a system should come with an explanation: “AI systems should deliver accompanying evidence or reasons for all their outputs.” From the motivation they provide in the first few pages:

“Based on these calls for explainable systems [40], it can be assumed that the failure to articulate the rationale for an answer can affect the level of trust users will grant that system. Suspicions that the system is biased or unfair can raise concerns about harm to oneself and to society [102]. This may slow societal acceptance and adoption of the technology, as members of the general public oftentimes place the burden of meeting societal goals on manufacturers and programmers themselves [27, 102]. Therefore, in terms of societal acceptance and trust, developers of AI systems may need to consider that multiple attributes of an AI system can influence public perception of the system. Explainable AI is one of several properties that characterize trust in AI systems [83, 92].”

NIST is an organization whose mission involves increasing trust in tech, so thinking about what society wants and needs is not crazy. The report summarizes a lot of recent work in explainable AI to back their principles, and acknowledges at points that techniques are still being developed. Still, I find the report to be kind of a bold statement. Explainable AI is pretty open to interpretation, as the research is still developing. The report describes ideals, but on what seems like still-underspecified ground.

The second principle, for instance, is:

Systems should provide explanations that are meaningful or understandable to individual users.

They expound on this: 

A system fulfills the Meaningful principle if the recipient understands the system’s explanations. Generally, this principle is fulfilled if a user can understand the explanation, and/or it is useful to complete a task. This principle does not imply that the explanation is one size fits all. Multiple groups of users for a system may require different explanations. The Meaningful principle allows for explanations which are tailored to each of the user groups. Groups may be defined broadly as the developers of a system vs. end-users of a system; lawyers/judges vs. juries; etc. The goals and desiderata for these groups may vary. For example, what is meaningful to a forensic practitioner may be different than what is meaningful to a juror [31]. 

Later they mention the difficulty of modeling the human model interaction:

As people gain experience with a task, what they consider a meaningful explanation will likely change [10, 35, 57, 72, 73]. Therefore, meaningfulness is influenced by a combination of the AI system’s explanation and a person’s prior knowledge, experiences, and mental processes. All of the factors that influence meaningfulness contribute to the difficulty in modeling the interface between AI and humans. Developing systems that produce meaningful explanations need to account for both computational and human factors [22, 58].

I would like to interpret this as saying we need to model the “system” composed of the human and the model, which is a direction of some recent work in human-AI complementarity, though it’s not clear how much the report intends formal modeling versus simply considering things like users’ prior knowledge. Both are hard, of course, and areas where research is still very early along. As far as I know, most of the work in AI explainability and interpretability—with the former applying to any system for which reasoning can be generated and the latter applying to “self-explainable” models that people can understand more or less on their own—is still about developing different techniques. Less of it is about studying their effects, and even less about applying any formal framework to understand both the human and the model together. And relatively little has involved computer scientists teaming up with non-computer scientists to tackle more of the human side.

The third principle requires “explanation accuracy”:

  • The explanation correctly reflects the system’s process for generating the output.

Seems reasonable enough, but finding accurate ways to explain model predictions is not an easy problem. I’m reminded of some recent work in interpretability showing that some very “intuitive” approaches to deep neural net explanations, like reporting the salient features of an input given the output, sometimes don’t exhibit basic traits we’d expect, such as generating the same explanations when two models with different architectures produce the same outputs for a set of inputs.
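
For a concrete flavor of the kind of check that work performs, here is a minimal sketch (PyTorch assumed; the toy models, data, and similarity metric are all invented for illustration) that computes input-gradient saliency for two differently structured networks and compares the explanations on inputs where their predictions agree:

    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    # Two toy classifiers with different architectures (hypothetical stand-ins).
    model_a = nn.Sequential(nn.Linear(10, 32), nn.ReLU(), nn.Linear(32, 2))
    model_b = nn.Sequential(nn.Linear(10, 64), nn.Tanh(), nn.Linear(64, 64),
                            nn.Tanh(), nn.Linear(64, 2))

    def saliency(model, x):
        # Gradient of the predicted class's logit with respect to the input features.
        x = x.clone().requires_grad_(True)
        logits = model(x)
        pred = logits.argmax(dim=1)
        logits[torch.arange(len(x)), pred].sum().backward()
        return x.grad.abs()

    x = torch.randn(200, 10)
    # Restrict the comparison to inputs where the two models predict the same class.
    agree = model_a(x).argmax(dim=1) == model_b(x).argmax(dim=1)

    sal_a = saliency(model_a, x[agree])
    sal_b = saliency(model_b, x[agree])
    # Cosine similarity between the two models' explanations; nothing guarantees
    # this is high even though the predictions matched.
    sim = nn.functional.cosine_similarity(sal_a, sal_b, dim=1)
    print("mean explanation similarity on agreed inputs:", sim.mean().item())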

Also, especially when the explanations must correctly reflect the system’s process, they will often entail introducing the user to features or counterfactuals, and may include probabilities to convey the model’s confidence in the prediction. This is all extra information for the end user to process. It makes me think of the broader challenge of expressing uncertainty alongside model estimates. AI explanations, like expressions of uncertainty, become an extra thing that users have to make sense of in whatever decision context they’re in. “As-if optimization,” where you assume the point estimate or prediction is correct and go ahead with your decision, becomes harder.

One thing I find interesting, though, is how much more urgency there seems to be in examples like the NIST report or the popular tech press around explainable AI relative to the idea that we need to make the outputs of statistical models more useful by expressing uncertainty. NIST has addressed the latter in multiple reports, though never with the implied urgency and human-centric focus here. It’s not like there’s a shortage of examples where misinterpreting uncertainty in model estimates led to bad choices on the parts of individuals, governments, etc. The suggestion that AI predictions require explanations and the suggestion that measurements require uncertainty statements come from a similar motivation: modelers owe their users provenance information so they can make more informed decisions. However, the NIST reports on uncertainty have not really discussed public needs or requirements implied by human cognition, focusing instead on defining different sources of uncertainty and error measurements. Maybe times have changed, and if NIST did an uncertainty report now it would be much less dry and technical, stressing the importance of understanding what people can tolerate or need. Or maybe it’s a branding thing. Explainable AI sounds like an answer to people’s deepest fears of being outsmarted by a machine. Uncertainty communication just sounds like a problem.

At any rate, something I’ve seen again and again in my research, and which is well known in JDM work on reasoning under uncertainty, is the pervasiveness of the mental approximations or heuristics people use to simplify how they handle the extra information, even in the case of seemingly “optimal” representations. It didn’t really surprise me to learn that heuristics have come up in the explainable AI literature recently. For instance, some work argues that people rarely engage analytically with each individual AI recommendation and explanation; instead they develop general heuristics, accurate or not, about whether and when to follow the AI suggestions.

In contrast to uncertainty, though, which is sometimes seen as risky since it might confuse end users of a predictive system, explainability often gets seen as a positive thing. Yet it’s pretty well established at this point that people overrely on AI recommendations in many settings, and explainability does not necessarily help as much as we might hope. For instance, a recent paper finds that explanations increase overreliance regardless of whether the AI recommendation is correct. So the relationship more explanation = more trust should not be assumed when trust is invoked as in the NIST report, just like it shouldn’t be assumed that more expression of uncertainty = more trust.

So, lots of similarities on the surface. Though not (yet) much overlap between uncertainty expression/communication research and explainable AI research.

The final principle, since I’ve gone through the other three, is about constraints on use of a model: 

The system only operates under conditions for which it was designed or when the system reaches a sufficient confidence in its output. (The idea is that if a system has insufficient confidence in its decision, it should not supply a decision to the user.)

Elaborated a bit:

The previous principles implicitly assume that a system is operating within its knowledge limits. This Knowledge Limits principle states that systems identify cases they were not designed or approved to operate, or their answers are not reliable. 

Another good idea, but hard. Understanding and expressing dataset limitations is also a growing area of research (I like the term Dataset Cartography, for instance). I can’t help but wonder, why does this property tend to only come up when we talk about AI/ML models rather than statistical models used to inform real-world decisions more broadly? Is it because statistical modeling outside of ML is seen as being more about understanding parameter relationships than making decisions? While examples like Bayesian forecasting models are not black boxes the way deep neural nets are, there’s still lots of room for end users to misinterpret how they work or how reliable their predictions are (election forecasting, for example). Or maybe because the datasets tend to be larger and are often domain-general in AI, there are more worries about overlooking mismatches between the development and use populations, and explanations are a way to guard against that. I kind of doubt there’s a single strong reason to worry so much more about AI/ML models than other predictive models.
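
On the confidence part of this last principle, one simplistic reading is a reject option: withhold the decision whenever the model’s confidence falls below some threshold. A minimal sketch (the probabilities and threshold here are hypothetical stand-ins, and a fuller treatment would also flag inputs outside the training distribution, not just low softmax confidence):

    import numpy as np

    def decide_or_abstain(probs, threshold=0.8):
        # probs: (n_samples, n_classes) predicted class probabilities from some model.
        probs = np.asarray(probs)
        confidence = probs.max(axis=1)
        labels = probs.argmax(axis=1)
        # Return the predicted class where confident enough, None (abstain) otherwise.
        return [int(lab) if c >= threshold else None
                for lab, c in zip(labels, confidence)]

    print(decide_or_abstain([[0.95, 0.05], [0.55, 0.45]]))  # -> [0, None]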

27 thoughts on “Is explainability the new uncertainty?”

  1. Listen, and understand. That AI system is out there. It can’t be bargained with. It can’t be reasoned with. It doesn’t feel pity, or remorse, or fear. And it absolutely will not stop to deliver explanations, ever, until you are dead.

  2. I can’t help but wonder, why does this property tend to only come up when we talk about AI/ML models rather than statistical models used to inform real world decisions more broadly?

    People who attempt to interpret the coefficients of a linear model think they know what they mean, but they don’t. If they did, then they would realize they are arbitrary values that depend on whatever data was available to include and whatever arbitrary choices about model specification were made. Therefore, there is no reason to place importance on those values.

    A neural network is superior because you have no illusion that you can meaningfully interpret the coefficients.

    • Anon:

      I disagree. We have a lot in Regression and Other Stories about interpretation of linear and logistic regression, and I stand by what we wrote. I agree that sometimes linear regression can be a complicated mess, but often it can be understood.

      • For example:

        The dependent variable is assumed to be a linear function of the variables specified in the model.

        https://en.wikipedia.org/wiki/Gauss%E2%80%93Markov_theorem#Proof

        Unless the model is correctly specified, the parameters/coefficients are arbitrary.

        At a minimum, you need to include all relevant variables and no irrelevant ones (of course, negligible variables can be ignored in real life). What real-world application of linear regression meets these criteria? In every case people use the data available to them, which is arbitrary.

        Then there is choosing the interactions and other model specifications. It was shown empirically in the paper discussed here:
        https://statmodeling.stat.columbia.edu/2019/06/27/the-garden-of-603979752-forking-paths/

        They come up with hundreds of millions of equally acceptable, arbitrary linear models, which yielded about 50% positive and 50% negative values for the coefficient in question. They then conclude this was only the tip of the iceberg:

        For the sake of simplicity and comparison, simple linear regressions were used in this study, overlooking the fact that the relationship of interest is probably more complex, non-linear or hierarchical [13].

        That is why the model needs to be derived from a set of agreed upon assumptions if you want to interpret the coefficients, instead of just using the model to predict something (as a neural network is used).
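
        To make that concrete, a toy simulation (the data-generating process is invented purely for illustration) in which the coefficient on x flips sign depending on which other variable happens to be available to include:

          import numpy as np

          rng = np.random.default_rng(1)
          n = 10_000
          z = rng.normal(size=n)              # a variable the analyst may or may not have
          x = 2 * z + rng.normal(size=n)
          y = -1 * x + 5 * z + rng.normal(size=n)

          def coef_on_x(design):
              # Ordinary least squares; return the coefficient on x (column 1 after the intercept).
              beta, *_ = np.linalg.lstsq(design, y, rcond=None)
              return beta[1]

          ones = np.ones((n, 1))
          without_z = np.column_stack([ones, x])      # specification that omits z
          with_z = np.column_stack([ones, x, z])      # specification that includes z

          print("coef on x, z omitted :", round(coef_on_x(without_z), 2))   # about +1
          print("coef on x, z included:", round(coef_on_x(with_z), 2))      # about -1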

        • Anon:

          I’m talking about our recent book here, not about wikipedia! I’m not saying that understanding regression is easy, just that it can be done if we are careful.

        • How is it possible to choose the correct model out of hundreds of millions (or more) of equally plausible specifications using only the data that happened to be available?

        • Sometimes there aren’t hundreds of millions of equally plausible specifications. It depends a lot on what you work on. For example Andrew sometimes consults on pharmacokinetic models. I’m working with people on biomechanics models.

          I agree there are many many places where people just choose to regress on whatever variables they have available and then claim causal inference, but it’s not the only thing out there.

        • I think the term “linear regression” is overloaded. Linear regression variously means:

          1. a model where Y is a linear function of covariates, where covariates are literally specified (no x_i^2 in the design matrix)
          2. a model fit with OLS
          3. a model where Y ~ X b where the design matrix X is unrestricted

          If your design matrix is a bunch of literally represented covariates, I’d wager you’re just using a convenient set of attributes to do a prediction. In my experience, when building a principled model using domain-knowledge influenced assumptions, I never end up with a model that’s linear in the first sense because

          1. log(x) is typically more meaningful because the real world operates on a proportional scale
          2. Most social or psychological effects tend to saturate in limits. As I continue to make a website faster, eventually I approach the boundaries of human perception and the effect on people’s ad clicking disappears.
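
          A minimal sketch of that second point (the functional form, data, and numbers are all hypothetical): a saturating response fit directly, rather than forcing a straight line through it:

            import numpy as np
            from scipy.optimize import curve_fit

            rng = np.random.default_rng(0)
            speedup = rng.uniform(0, 5, size=200)                   # hypothetical covariate
            true_rate = 0.10 + 0.05 * (1 - np.exp(-speedup / 1.5))  # effect saturates
            clicks = true_rate + rng.normal(scale=0.01, size=200)   # noisy observed outcome

            def saturating(x, base, gain, scale):
                # Roughly linear at first, then flattening out as x grows.
                return base + gain * (1 - np.exp(-x / scale))

            params, _ = curve_fit(saturating, speedup, clicks, p0=[0.1, 0.05, 1.0])
            print("fitted base, gain, scale:", np.round(params, 3))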

        • Those pharmacokinetic models are derived from assumptions people found reasonable. Although the assumptions may fail, the coefficients do have meaning.

          That isn’t the same as the y = b0*x0 + b1*x1 + … + bn*xn type model used everywhere to “adjust” for various confounding factors.

          In that case the meaning and value of b0 depends on what other variables are included and there is no reason to choose one specification over another.

        • Well then how can you interpret the coefficients, since if you change the model to another equally plausible one the value will change?

          And there are millions upon millions of such equally plausible models, most of which you can’t even check in practice because the data is not available.

  3. Jessica:

    I guess one reason that “explainable AI” is so popular is that it means different things to different people. I like the idea of explainable AI because to me it represents tools for understanding fitted models. For other people, explainable AI is something like taking the predictions from a deep learning model and fitting a logistic regression. When the definition is vague enuf, it’s like mom and apple pie and everyone can love it.
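
    That second reading is roughly the global-surrogate idea; a minimal sketch (the data, black-box model, and surrogate choice are all arbitrary stand-ins, scikit-learn assumed):

      from sklearn.datasets import make_classification
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.linear_model import LogisticRegression

      # Hypothetical data and black box; the surrogate idea is what matters here.
      X, y = make_classification(n_samples=2000, n_features=8, random_state=0)
      black_box = RandomForestClassifier(random_state=0).fit(X, y)

      # Fit a simple model to the black box's predictions rather than to the labels.
      surrogate = LogisticRegression(max_iter=1000).fit(X, black_box.predict(X))

      # The surrogate's coefficients serve as the "explanation"; its fidelity to the
      # black box bounds how seriously those coefficients should be taken.
      print("fidelity to black box:", surrogate.score(X, black_box.predict(X)))
      print("surrogate coefficients:", surrogate.coef_.round(2))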

  4. I think one of your questions is why there is so much more attention to explainability in AI/ML systems than to uncertainty in statistical models. I’m neither a computer scientist nor a statistician but a quantitative social scientist who’s interested in both. Could it be something as simple as that designers of AI/ML systems are under more pressure from end users than are statistical modelers?

    We all hear about AI/ML systems being biased against Blacks, women, etc. as well as how consequential some of the decisions they’re used to guide are (parole, probation, loans, employment, etc.). Other than, as you suggested, in the area of polling (or maybe around Census time), I don’t think the public gets as exercised about “statistical models gone wrong” or how consequential such models can be. Maybe researchers in AI/ML, in terms of how they allocate their time, are simply responding to this difference.

  5. Agree that “explainable AI” is a mess. I suspect it has a lot to do with protecting past investments in learning and promoting complex models like deep neural nets, complicated messy linear regressions, etc. _where they actually have no advantage_ given understanding is important.

    One can carve off building inherently interpretable models for applications where they have adequate accuracy. These are purposely constrained so that their reasoning processes are more understandable to humans. This not only makes the connection between input data and predictions almost obvious; such models are also much easier to troubleshoot and modify as needed.

    However, it is the prediction model that is understandable, not necessarily the prediction task itself. So once you bring in the human, I believe you have to address the prediction task itself.

    Anyway, an intro to a current draft explaining the differences between interpretable and explainable.

    Explore building accurate interpretable machine learning: Why choose not to understand where and when you can?

    Introduction.
    In spite of an increasing number of examples where simple prediction models have been found to predict as accurately as complex prediction models, a widespread myth persists that accurate prediction requires models that are too complex for most to understand. The often-suggested simple remedy for this unmanageable complexity is just finding ways to explain these black-box models. However, those explanations are seldom fully accurate. Rather than being directly connected with what is going on in the black-box model, they are just stories for getting concordant predictions. Given that the concordance is not perfect, they may well be very misleading in some cases.

    Unfortunately, this supposed fix might be lessening the perceived importance of dispelling the myth. However, a wider awareness of the increasing number of techniques to build simple interpretable models from scratch that achieve high accuracy may finally dispel the myth of unavoidable complexity in accurate prediction. The techniques are not simple refinements of, say, linear or logistic regression (by rounding their coefficients to integers, which loses accuracy) but involve discernment of appropriate domain-based constraints and newer methods of constrained optimization. This results in a spectrum of ease of interpretability of prediction across different applications.
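
    To illustrate just the “purposely constrained so a human can follow the reasoning” part (this is only the simplest familiar case, not the domain-constrained optimization methods the draft has in mind), a sketch assuming scikit-learn:

      from sklearn.datasets import load_breast_cancer
      from sklearn.tree import DecisionTreeClassifier, export_text

      # A purposely constrained model: a depth-2 tree whose entire decision logic
      # can be printed and read end to end. (Toy dataset chosen only for convenience.)
      data = load_breast_cancer()
      model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(data.data, data.target)

      print(export_text(model, feature_names=list(data.feature_names)))
      print("training accuracy:", round(model.score(data.data, data.target), 3))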

  6. The issue here is obviously about “explaining” AI/ML results to convince people that AI/ML isn’t encoding discriminatory criteria.

    For some people non-discriminatory practices are a bad thing. It’s almost certainly true that AI/ML will be more fair/impartial than humans. That means people who currently benefit by biasing processes through social and political channels won’t get that benefit when those channels are cut off by AI/ML – or that such benefits will need to be stated explicitly and thereby exposed.

    • Anon:

      What makes you say, “It’s almost certainly true that AI/ML will be more fair/impartial than humans”?

      An AI/ML puts the human outside the loop rather than inside the loop, but humans still must decide what loop to use, no? Or they must decide what algorithm is used to choose the loop, or what algorithm is used to choose the algorithm that is used to choose the loop, etc.

      • That AI will be more fair/impartial than humans will be as true for loan apps and employment promotions as it is likely to be regarding autonomous vehicle safety.

        First, a key purpose of designing the AI is to remove the normal sources of human error. We add sensors to the rear part of vehicles even though we don’t have eyes in the back of our heads. We don’t just replace humans with AI, we make AI that has capabilities humans don’t have.

        Second, one AI can be repeatedly tested and proven to produce accurate results. But no matter how many times you test even a single human, that human can change how it operates at any instant. Multiply that by dozens or even hundreds of individuals making decisions and the likelihood of deviations turns into a question of the number of deviations.

        Third, the output of an AI can be controlled at the highest levels of an organization, where the responsibility lies, meaning it will serve the purposes of the small # of people at the top, unlike hundreds or thousands of individual humans at the bottom of the chain, who would make decisions to serve their own needs, fairness notwithstanding.

        • > That AI will be more fair/impartial than humans will be as true for loan apps and employment promotions

          An AI will be more fair/impartial than a human who hypothetically makes a decision based on the same data, meaning some wide vector in R^n.

  7. Look ahead to AI being used and becoming central to a judgement in a court case. In my view, this will certainly happen, it’s just a question of when. The model and the use made of it will need to be fully justified.

    The likelihood that the judge will have detailed knowledge is perhaps remote; the same with the advocates. So how will a valid decision be made? Is it actually possible for one to be made with two groups of ‘experts’, in simple terms, disagreeing with each other?

    Those early cases will set precedents. Future judgements will use the outcome and the arguments to develop the law further. And this will be irrespective of whether there was a right or wrong answer. What will happen is that a direction of travel will have been set that may well be difficult to overturn.

    AI may well have its place in solving large combinatorial problems, but where the technology is used to influence, or make, decisions that directly affect people…. well, I don’t believe this should be where our use of the technology should be going.
