How should Bayesian methods for the social sciences differ from Bayesian methods for the physical sciences? (my talk at the Bayesian Methods for the Social Sciences workshop, 21 Oct 2022, Paris)

The title of the workshop is Bayesian Methods for the Social Sciences, so it makes sense to ask what’s so special about the social sciences.

At first I was thinking of comparing the social sciences to the natural sciences, but I think that social sciences such as political science, economics, and sociology have a lot in common with biological sciences and medicine, along with in-between sciences such as psychology: all of these are characterized by high variability, measurement challenges, and difficulty in matching scientific theories to mathematical or statistical models.

So I decided that the more interesting comparison is to the physical sciences, where variability tends to be lower, or at least better behaved (for example, shot noise or Brownian motion). In the social sciences, statistical models—Bayesian or otherwise—have a lot more subjectivity, a lot more researcher degrees of freedom. In theory, Bayesian inference should work for any problem, but it has a different flavor when our models can be way off and there can be big gaps between actual measurements and the goals of measurements.

So I think there will be lots to say on this topic. I’m hoping the conference will be in French so that I’m forced to speak slowly.

10 thoughts on “How should Bayesian methods for the social sciences differ from Bayesian methods for the physical sciences? (my talk at the Bayesian Methods for the Social Sciences workshop, 21 Oct 2022, Paris)”

    • Astrophysics and particle physics also have different kinds of data. Andrew is emphasizing the high variability and measurement challenges of social-science data in the post and claiming the data are less variable in the physical sciences. I’m not sure I buy that. I was blown away by a post by John Cook on how planetary orbits are very nearly circular, which explains how Kepler took Tycho’s (apparently very accurate) measurements and worried about a discrepancy of 0.037% (!) from a circular orbit (Cook’s whole sequence of posts on this topic is fascinating).

  1. One of the many ways you could say there are “two kinds” of scientific inquiry is between those whose data is directly measured or observed (subject of course to measurement error) versus those whose data is largely comprised of things that can only be assessed indirectly (subject to measurement error plus lack of validity).

    In the data I’ve worked with over my career the almost universal shortcoming has been lack of discriminant validity. You’ve got a measure that seems to correlate well enough with the specific construct you claim to be modeling but unfortunately it also correlates pretty well with several other related constructs that you’re treating as distinct.

    Or you have separate measures for three or four highly specific, carefully defined theoretical or latent constructs, but once the data are gathered there’s evidence for the possibility that all those measures simply reflect some vaguer, more general construct and not the picky little angels-on-the-head-of-a-pin distinctions your theory is based on.

    That’s the kind of stuff statistical models of physical processes seldom have to deal with.
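The discriminant-validity failure described above can be made concrete with a toy simulation (all numbers and variable names here are invented for illustration, not from any real study): a measure built to tap one construct can still correlate substantially with a supposedly distinct construct whenever the two constructs share common variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Two "distinct" latent constructs that in fact share a common factor,
# and one measure intended to reflect only the first construct.
common = rng.normal(size=n)
construct_a = common + rng.normal(size=n)   # the intended construct
construct_b = common + rng.normal(size=n)   # supposedly distinct construct
measure = construct_a + rng.normal(size=n)  # a noisy measure of A only

r_a = np.corrcoef(measure, construct_a)[0, 1]
r_b = np.corrcoef(measure, construct_b)[0, 1]
# r_a is high (the measure works), but r_b is far from zero:
# the measure also tracks the "distinct" construct B,
# i.e. weak discriminant validity.
```

With these settings the theoretical correlations are about 0.82 with the intended construct and 0.41 with the supposedly distinct one, even though the measure was built to reflect only the first.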

    • “One of the many ways you could say there are “two kinds” of scientific inquiry is between those whose data is directly measured or observed (subject of course to measurement error) versus those whose data is largely comprised of things that can only be assessed indirectly (subject to measurement error plus lack of validity).”

      I don’t mean to argue against this point, but when theories (I’m hesitant to call them models) fail in the physical sciences, it is rarely due to accumulated measurement error. Measurement error is generally a problem that can be solved. Physical-science theories get proved wrong for one of two reasons: poverty of imagination or extrapolation inaccuracy. In the former, the scientist was unable to imagine what was really happening; in the latter, behavior inside an operating envelope was inaccurately extended beyond the measured range due to unanticipated nonlinearity. These problems have little to do with signal-to-noise ratios or statistical interpretation of data.

      I’m at a loss to imagine how Bayesian reasoning could affect any of this. I wish I could attend the talk!

      • What’s the difference between a theory and a model? I would say Newton introduced a model of gravity, though there was never a “direct measure” of gravity. Einstein did the same with a more refined relativistic model, which was indeed motivated by discrepancies in Newton’s model. My understanding is that we’re still evaluating aspects of Einstein’s theoretical model empirically.

        We still don’t know the universal gravitational constant very accurately (four digits, according to Wikipedia). So this is apparently a problem where measurement error has not been fully solved.

        We also don’t know the masses of neutrinos. Check out this paper by Joseph Formaggio et al. from 2022 (or many of his other papers), which says

        Though the existence of neutrino mass is now firmly established experimentally, the mass scale itself is still unknown, and remains an outstanding question in the field of experimental neutrino physics.

        Formaggio et al. use Bayesian statistics to deal with the measurement error induced by indirect measurements. We know Joe because he’s used Stan for inference in this project.

        You may recall from Andrew’s earlier post that the LIGO project is also using Bayesian statistics (including Stan!) in modeling gravitational wave data.
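The measurement-error idea in this comment can be sketched with a minimal conjugate normal-normal Bayesian update. This is an illustrative toy, not the neutrino-mass or LIGO models mentioned above, and every number in it is invented: several noisy indirect measurements are combined with a weak prior to yield a posterior that is tighter than any single measurement.

```python
import numpy as np

def posterior_normal(prior_mean, prior_sd, obs, obs_sd):
    """Conjugate posterior for a normal likelihood with known noise sd
    and a normal prior on the unknown quantity."""
    obs = np.asarray(obs, dtype=float)
    prior_prec = 1.0 / prior_sd**2          # prior precision
    obs_prec = len(obs) / obs_sd**2         # total data precision
    post_var = 1.0 / (prior_prec + obs_prec)
    post_mean = post_var * (prior_prec * prior_mean + obs_prec * obs.mean())
    return post_mean, np.sqrt(post_var)

# Three noisy "indirect" measurements of an unknown quantity,
# each with known measurement noise sd = 0.5, and a vague prior:
mean, sd = posterior_normal(prior_mean=0.0, prior_sd=10.0,
                            obs=[1.2, 0.9, 1.1], obs_sd=0.5)
```

The posterior standard deviation (about 0.29 here) is smaller than the 0.5 noise of any individual measurement: combining indirect measurements within the model is what lets these experiments say anything at all about a quantity no one can measure directly.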

        • Bob wrote:

          “We still don’t know the universal gravitational constant very accurately (4 digits according to Wikipedia). So this is apparently a problem where measurement error has not been fully solved.”

          This actually reinforces the point I was making. We only need the extra digits when we are trying to figure something else out. For example, we need the gravitational constant when we sling a spaceship towards another planet. But we don’t need those extra digits: we sling the vessel close to the planet and then finish with retro rockets. And since the vessel got close, we have once again shown that we have an adequate estimate of the gravitational constant. “Adequacy” is the key concept, not precision.

          Daniel wrote:

          “Look at the melting temperature of sucrose… For decades people had conflicting measurements of it. I remember reading recently that very careful observation of heated sucrose showed that it doesn’t have a melting temperature, it goes through a whole complex process as you heat it there is no precise point at which it transitions from solid to liquid.”

          This reinforces my other point about poverty of the imagination.

      • One problem that comes up more often in the social sciences is construct validity. So for example we can talk about velocity in physics without worrying that there is no such thing and that the stuff we measure when we measure velocity is actually something called “motionocity” which only kind of looks like velocity if you don’t look carefully.

        Something like “primary psychopathy” is more squirrelly, as is, for example, “real GDP.”

        But that doesn’t mean validity isn’t a problem in the physical sciences. Look at the melting temperature of sucrose: for decades people had conflicting measurements of it. I remember reading recently that very careful observation of heated sucrose showed that it doesn’t have a melting temperature; it goes through a whole complex process as you heat it, and there is no precise point at which it transitions from solid to liquid.

        Or consider the problems of predicting the weather. There are many, many heuristic shortcuts in the physics being simulated: cloud formation, rain initiation, turbulence, and so on. At the scale these models are written at, with something like tens of kilometers between cells, the behavior of the fluid isn’t really well modeled by Navier-Stokes equations with known measured values for the properties of moist air, etc. The validity of, say, “viscosity” or even “gas” is highly suspect. What if it’s a gas with a bunch of water droplets being buoyed up by drag?

  2. I think a crucial difference is that the physical sciences are cumulative in ways that the social sciences are not. For example, a new measurement of a quantity in nuclear physics is typically twice as precise as the best previous measurement. I doubt the social sciences do better than the medical sciences, where a new measurement of a quantity typically has 50% larger uncertainty than the best previous measurement of the same quantity. Rephrasing Rutherford’s aphorism: if you are arguing about statistics in the physical sciences, you can often just do a better experiment.
