
Google on Responsible AI Practices

Great and beautifully written advice for any data science setting:



  1. Ben Hanowell says:

    Prediction: With greater-than-even probability, Frank Harrell will not like the section that reads: “false positive and false negative rates sliced across different subgroups.”

    • Andrew says:


      I actually have no idea what they mean by “false positive and false negative rates” in that context.

      • Let’s take a simple case, where a false positive is offering a loan to someone who will default and a false negative is not offering one to someone who would pay it back. The goal would be to make sure we’re not introducing extra bias for some subgroup of the population.

        But forget about false positives and false negatives; that’s just one error criterion for when you make binary decisions. The bigger idea is really just evaluating different groups and interactions in multilevel regressions rather than simply looking at aggregate results, which might not reveal biases in subgroups.
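To make the loan example concrete, here is a minimal sketch of "slicing" false positive and false negative rates by subgroup. Everything in it is an illustrative assumption rather than anything from Google's guidance: the function name `sliced_error_rates`, the group labels, and the toy records are all made up. A "positive" prediction means offering the loan, and the actual outcome is whether the person would repay.

```python
from collections import defaultdict

def sliced_error_rates(records):
    """Compute false positive and false negative rates per subgroup.

    records: iterable of (group, predicted, actual) tuples, where
    predicted=True means "offer the loan" and actual=True means
    "would repay". Hypothetical helper for illustration only.
    """
    counts = defaultdict(lambda: {"fp": 0, "fn": 0, "neg": 0, "pos": 0})
    for group, predicted, actual in records:
        c = counts[group]
        if actual:
            c["pos"] += 1
            if not predicted:
                c["fn"] += 1  # denied someone who would have repaid
        else:
            c["neg"] += 1
            if predicted:
                c["fp"] += 1  # offered a loan to someone who defaults
    rates = {}
    for group, c in counts.items():
        rates[group] = {
            # None when a subgroup has no cases of that kind
            "fpr": c["fp"] / c["neg"] if c["neg"] else None,
            "fnr": c["fn"] / c["pos"] if c["pos"] else None,
        }
    return rates

# Made-up toy data: (group, predicted, actual)
records = [
    ("A", True, True), ("A", True, False),
    ("A", False, False), ("A", False, True),
    ("B", True, True), ("B", True, True),
    ("B", False, True), ("B", False, True),
    ("B", False, False), ("B", False, False),
]

rates = sliced_error_rates(records)
for group in sorted(rates):
    print(group, rates[group])
```

In this toy data the two groups have the same false negative rate but different false positive rates, which is the kind of difference an aggregate number would hide. A multilevel regression, as suggested above, would be the more general tool; this sketch only covers the binary-decision case.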

    • Keith O'Rourke says:

      It does look like a public relations piece – clicking through gets you to actual papers e.g. I found this

      The “sliced across different subgroups” seems to be a warning not to rely on global assessments of, say, prediction (like the NN that had the lowest MSE predicting toxicity by often missing highly toxic cases while doing very well on lower toxicity).

      • Anoneuoid says:

        That sounds more like using the wrong evaluation metric. They didn’t actually want to optimize for low MSE, since underestimating high toxicity is more dangerous, so why did they train their model to do that?
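The point about the evaluation metric can be sketched with a toy comparison. This is only an illustration under stated assumptions: the numbers are invented, and `asymmetric_mse` with its `under_weight` parameter is a hypothetical metric chosen for the example, not the one used in the toxicity work mentioned above. It weights underestimates more heavily than overestimates.

```python
def mse(y_true, y_pred):
    """Plain mean squared error."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def asymmetric_mse(y_true, y_pred, under_weight=5.0):
    """Squared error where underestimates count under_weight times as much.

    Hypothetical metric for illustration; the weight of 5.0 is arbitrary.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        err = p - t
        weight = under_weight if err < 0 else 1.0
        total += weight * err ** 2
    return total / len(y_true)

# Invented toxicity scores: four low-toxicity cases and one highly toxic case.
y_true = [1, 1, 1, 1, 10]
model_a = [1, 1, 1, 1, 6]   # perfect on low toxicity, badly underestimates the toxic case
model_b = [3, 3, 3, 3, 9]   # overestimates low toxicity, close on the toxic case

print("MSE:       ", mse(y_true, model_a), mse(y_true, model_b))
print("Asymmetric:", asymmetric_mse(y_true, model_a), asymmetric_mse(y_true, model_b))
```

Here `model_a` wins on plain MSE while `model_b` wins once underestimates are penalized more, so the ranking of the two models depends entirely on which metric is taken to matter.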

  2. Thanatos Savehn says:

    “The way actual users experience your system …”

    Who are the non-actual users and what do they experience when they meet your system?

    Is this philosophy of statistics or the statistics of philosophy?

    • jrkrideau says:

      Non-actual users are the three geek friends in the next cubicles who give it a whirl and say “Cool”.

      While not actual aliens, their knowledge and reactions have no relation to an actual user.

    • Ricardo says:

      I read it just as an emphasis expression, something like: the specific type of people that experience your system. Not as a deep philosophical statement.

    • The non-actual users can be simulated users that you create while training the system. Let’s say I’m building a search system. I’ll simulate user queries and collect them for evaluation. They may or may not be representative of what actual users do when they hit the system in the wild. jrkrideau brings up the intermediate step of the friends in the next cubicles, who, like the simulated test set, may not be representative of the intended target audience for an application.

  3. Roy says:

    I assume this is how Google actually implements responsible policy:

    Being old and cynical as I am, I have to wonder whether, for Google and other similarly large companies, the terms “responsible” and “AI” (or “ML”) aren’t oxymorons.

  4. Rahul says:

    I’d be curious to hear of a project or feature that Google axed because it didn’t pass this policy. That is, is this actionable or only lip service?
