
“Science does not advance by guessing”

I agree with Deborah Mayo who agrees with Carlo Rovelli that “Science does not advance by guessing. It advances by new data or by a deep investigation of the content and the apparent contradictions of previous empirically successful theories.”

And, speaking as a statistician and statistical educator, I think there’s a big problem with the usual treatment of statistics and scientific discovery in statistics articles and textbooks: the usual pattern is for the theory and experimental design to be airlifted in from who-knows-where, and then the statistical methods are just used to prove (beyond some reasonable doubt) that the theory is correct, via a p-value or a posterior probability or whatever. As Seth pointed out many times, this skips the key question of where the theory came from, and it also skips the almost-as-important question of how the study is designed.

I do have a bit of a theory of where theories come from, and that is from anomalies: in a statistical sense, predictions from an existing model that do not make sense or that contradict the data. We discuss this in chapter 6 of BDA, and Cosma Shalizi and I frame it from a philosophical perspective in our philosophy paper. Or, for a completely non-technical, “humanistic,” take, see my paper with Thomas Basbøll on the idea that good stories are anomalous and immutable.

The idea is that we have a tentative model of the world, and we push that model, and gather data, and find problems with the model, and it is the anomalies that motivate us to go further.

The new theories themselves, though, where do they come from? That’s another question. It seems to me that new theories often come via analogies from other fields (or from other subfields, within physics, for example). At this point I think I should supply some examples but I don’t quite have the energy.

My real point is that sometimes it does seem like science advances by guessing, no? At least, retrospectively, it seems like Bohr, Dirac, etc., just kept guessing different formulations and equations and pushing them forward and getting results. Or, to put it another way, these guys did do “deep investigation of the content and the apparent contradictions of previous empirically successful theories.” But then they guessed too. But their guesses were highly structured, highly constrained. The guesses of Dirac etc. were mathematically sophisticated, not the sort of thing that some outsider could’ve come up with.

How does this relate to, say, political science or economics? I’m not sure. I do think that outsiders can make useful contributions to these fields but there does need to be some sense of the theoretical and empirical constraints.


  1. Rahul says:

    Science does need guessing. It’s just that they are usually educated guesses, not blind ones. And often, after many dud guesses, one guess works. One can hardly deny the role serendipity plays in the progress of science.

    Andrew does acknowledge this role of “guesswork” at the end of the post, but that only leaves me confused about what the earlier parts of the post are trying to get at.

    Evaluating other people’s guesses in real time is subjective: ones you like are good, others are stupid. In the end, a guess that leads to a productive extension of the model was a good guess. But I’m not sure what the whole discussion gets us.

    • Keith O'Rourke says:

      I agree: blind guessing needs to be replaced by educated guessing. And, as Mayo claims, her thinking is in line with C.S. Peirce’s, with guessing as an alternative term for his concept of abduction.

      Here is a caricature of Peirce’s approach in terms of 1) educated guessing, 2) confronting experience, and 3) coming to an explanation/understanding.

      1. The term guessing seems anomalous here – something like educated guessing would explain it though.

      2. A. If 1, Mayo should agree; B. When Mayo comments, she won’t fully agree and will express some disagreement.

      3. A. Mayo’s response seemed anomalous in respect “d” but would be explained by “e”.
      B1. Mayo’s comment should have “e+”; B2. It does, but also some possible “e-”.
      C1. 1+ would make sense of all of the above; C21. Should explain 2B; C22. Sort of does; C3. Go for 1+.

      1+, and start all over again with 2+ and 3+, then 1++, …, until stopping tentatively when an acceptable level of doubt is reached.

      Someone once described this as Peirce’s dizzying waltz of 123, 123, 123, …

      • Rahul says:

        In fact, aren’t educated guesses better than plain guesses only because that way we can be far more productive? I.e., solving a maze using some heuristic is faster than just randomly running about everywhere.

        OTOH, suppose someone just used blind guessing, and maybe he was just plain lucky, or naive, or persistent enough to spend his entire life attacking one small problem with uneducated guesswork. Should that in any way impugn the quality of the resultant model, merely because he used uneducated guesses and not educated ones (if said model stands the test of empirical data and post hoc reasonable causal explanations)?
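Rahul's maze analogy can be made concrete with a toy sketch (the maze layout and all numbers here are illustrative, not from the thread): an informed search such as BFS visits each cell at most once, while a blind random walk revisits cells endlessly and so takes far more steps on average.

```python
import random
from collections import deque

# A tiny grid maze: 0 = open, 1 = wall. Start top-left, goal bottom-right.
MAZE = [
    [0, 0, 1, 0, 0],
    [1, 0, 1, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 1, 0],
]
START, GOAL = (0, 0), (4, 4)

def neighbors(cell):
    """Yield the open cells adjacent to `cell`."""
    r, c = cell
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < len(MAZE) and 0 <= nc < len(MAZE[0]) and MAZE[nr][nc] == 0:
            yield (nr, nc)

def bfs_steps(start, goal):
    """Educated search: explore systematically, never revisiting a cell."""
    seen, queue, steps = {start}, deque([start]), 0
    while queue:
        cell = queue.popleft()
        steps += 1
        if cell == goal:
            return steps
        for nxt in neighbors(cell):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return None  # goal unreachable

def random_walk_steps(start, goal, rng):
    """Blind guessing: wander until the goal is stumbled upon."""
    cell, steps = start, 0
    while cell != goal:
        cell = rng.choice(list(neighbors(cell)))
        steps += 1
    return steps

rng = random.Random(0)  # fixed seed so the comparison is reproducible
informed = bfs_steps(START, GOAL)
blind = sum(random_walk_steps(START, GOAL, rng) for _ in range(100)) / 100
print(informed, blind)  # the random walk takes far more steps on average
```

The gap widens rapidly as the maze grows: BFS is bounded by the number of open cells, while the random walk's expected hitting time grows much faster, which is the quantitative content of "heuristics make us more productive."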

        • Keith O'Rourke says:

          > Solving a maze using some heuristic being faster than just randomly running about everywhere.
          Yes, and Peirce would argue that we evolved to be very good at (educated) guessing (when there is no established basis for good guesses in a problem, unlike the maze example), but he admitted he had no strong arguments to justify/confirm that.

          > merely because he used uneducated guesses and not educated ones?
          As JG Gardin put it, you can’t rule out a hypothesis by how it was generated, but you certainly may wish to take that into consideration when deciding whether to spend your time entertaining it or trying to replicate it.

          The science game is not about being less wrong _once_; it’s about accelerating the process of getting less wrong on as many questions as possible (by utilising/being part of a community of enquirers).

        • Nick Menzies says:

          I think there is also the issue that our ability to test the truth of a given hypothesis will always have some false positive rate, and so p(false positive) will be higher with random guessing. Thus educated guessing is also a process of constraining the set of potential hypotheses in order to reduce our overall error rate.
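Nick's point about false positive rates can be quantified with Bayes' rule. In this hypothetical sketch the false positive rate `alpha` and the `power` are assumed values; the only input that changes between regimes is the prior probability that a guessed hypothesis is true.

```python
# Positive predictive value of a "discovery" under two guessing regimes.
# alpha (false positive rate) and power are illustrative assumptions.
alpha, power = 0.05, 0.8

def ppv(prior):
    """P(hypothesis true | test rejects), via Bayes' rule."""
    return prior * power / (prior * power + (1 - prior) * alpha)

blind = ppv(0.01)     # random guessing: ~1% of tested hypotheses are true
educated = ppv(0.30)  # educated guessing: ~30% are true
print(round(blind, 3), round(educated, 3))
```

With these (assumed) numbers, a "significant" result from blind guessing is true only about 14% of the time, versus about 87% under educated guessing, even though the test itself is identical in both regimes.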

      • mayo says:

        Keith: Where did all this 2 A, d,e come from? I “guess” my post was so long ago, I’ve forgotten, but the interviewee, I recall, was referring to the value of understanding the depth of existing problems and theories as opposed to some kind of distantly constrained speculation. There was a subtle twist to it (that surprised me a little).
        I don’t see Peirce’s abduction as the type of guessing to which the interviewee was objecting because, for starters, that begins with a problem or data to be explained.
        That said, it fails to count as inductive inference for Peirce (as it’s not testing). (A side point.)

        • Keith O'Rourke says:

          It’s all fiction: I was just using Andrew’s comment that you agreed with him to speculate about how you might react to my comment, which you actually did not. It was just a fun way to walk through Peirce’s category theory; sorry if it seemed like something more.

          > it fails to count as inductive inference for Peirce
          Agree; as Peirce put it, stop and think about an abduction and it is gone. The slightest amount of reflection takes you on to a 2 and a 3.

          But ideas come from somewhere, and some people are better at it than others. As Jim Till (who mentored me early in my career) once said, we were not looking for stem cells, but we did notice something unusual and knew not to disregard it.

          • mayo says:

            Keith: Oh, so there’s much more to it than I thought.
            I don’t understand about reflection taking you onto a 2 and 3 (for Peirce). I don’t spoze you’re actually thinking of his “crude induction” (the lowest level), which can then advance to a second or third (more reliable) kind of induction because of an increase in severity?

            I just alluded to you in a current blog comment, but not for any valid reason (only because I was reminded of something Peirce said regarding “before trial”/“after-trial,” which happens also to be relevant to the question of how high a level on the inductive ladder you’ve climbed).

  2. Rahul says:

    “the usual pattern is for the theory and experimental design to be airlifted in from who-knows-where and then the statistical methods are just used to prove (beyond some reasonable doubt) that the theory is correct”

    How else could you do this? Isn’t this a standard problem in a “methods”/“tools” type class? Say you teach programming or AI or numerical methods: your audience will be broad and will not have the same prerequisites needed to comprehensively understand where the theory came from, at least not in any useful way where their prior intuition about the strength of that theory might be relevant.

    Unless you taught highly niche classes, e.g., “Statistical Methods for Oncology,” this sort of problem is something you’d just have to live with, right?

    • I don’t know, I think there’s no reason you can’t teach a little Oncology, or Manufacturing Engineering, or Psychology as part of a stats methods course. The main reason it’s not done is that the teacher usually doesn’t know much about those fields. But that’s more of a bug than a feature. It’s like all the Math profs who hate to teach “calculus for business” or whatever, and would rather go back to their offices and think about PDEs on manifolds or whatever.

      • West says:

        And from many a student’s perspective, being required to work out formal proofs of mathematical theorems is a waste of time. I can understand abstraction being the rule for a graduate-level differential geometry course, but not for an intro-to-stats class populated by a wide breadth of the freshman class.

        • Daniel Gotthardt says:

          Well, here in Germany you usually get the other end of the spectrum: intro methods classes for social scientists avoid any kind of mathematical understanding, and that’s not useful either. Yes, formal proofs are probably not necessary for everyone, although it would probably help everyone to have done them at least a few times, to get a grasp of abstract reasoning. However, I at least have the impression that the statistical concepts, and how the methods/tools work, are not taught enough to applied scientists in most fields.

          • Martha says:

            I think Daniel is correct that (as I read him) we need to (at least try to) find the right level of rigor and conceptual understanding to help students learn what is important; requiring mathematical proofs for a freshman intro-to-stats course isn’t helpful for most, but giving a cookbook “because I told you so” type course also has its problems.

            In particular, I think that some discussion of proofs is important in emphasizing that model assumptions are important.

            For example, when I’m teaching undergraduate math majors or a graduate course with mathematical prerequisites, I will show a slide with a proof (in a simple case, and of something I think is important for them to understand — e.g., where independence assumptions come into play) written out, but with reasons omitted, and ask them in class to supply the reasons. This helps focus on the important points, without getting bogged down in less important details.

            For a less mathematically sophisticated audience, I have given a chart (e.g., p. 21 of …) showing (again, in a simple case) which model assumptions go into proving which properties that are essential to the hypothesis test or confidence interval procedure. I think (at least hope) that this helps drive home the point that model assumptions are important.

  3. question says:

    Claim A: “Science does not advance by guessing.”
    Claim B: “It [Science] advances by new data or by a deep investigation of the content and the apparent contradictions of previous empirically successful theories.”

    This is a really strange quote. Are these supposed to be mutually exclusive? You look at new data, old data, and where other ideas seem to have merit or shortcomings then you guess and test.

    The fancier word for “guess” is “abduce”:
    “The American philosopher Charles Sanders Peirce (1839–1914) first introduced the term as “guessing”.[7] Peirce said that to abduce a hypothetical explanation a from an observed circumstance b is to surmise that a may be true because then b would be a matter of course.[8] Thus, to abduce a from b involves determining that a is sufficient, but not necessary, for b.”

    The science motto:
    Nullius in verba. Explore, Abduce, Deduce, Test.

    • Keith O'Rourke says:

      Question: This came through after I posted; otherwise I would have linked to it.

    • Martha says:

      Question said, “The fancier word for ‘guess’ is ‘abduce’.” This sounds like an oversimplification; consider this quote from Peirce’s “On the Logic of Drawing History from Ancient Documents Especially from Testimonies”:

      “Accepting the conclusion that an explanation is needed when facts contrary to what we should expect emerge, it follows that the explanation must be such a proposition as would lead to the prediction of the observed facts, either as necessary consequences or at least as very probable under the circumstances. A hypothesis then, has to be adopted, which is likely in itself, and renders the facts likely. This step of adopting a hypothesis as being suggested by the facts, is what I call abduction. I reckon it as a form of inference, however problematical the hypothesis may be held. What are to be the logical rules to which we are to conform in taking this step? There would be no logic in imposing rules, and saying that they ought to be followed, until it is made out that the purpose of hypothesis requires them. [—] Ultimately, the circumstance that a hypothesis, although it may lead us to expect some facts to be as they are, may in the future lead us to erroneous expectations about other facts, – this circumstance, which anybody must have admitted as soon as it was brought home to him, was brought home to scientific men so forcibly, first in astronomy, and then in other sciences, that it became axiomatical that a hypothesis adopted by abduction could only be adopted on probation, and must be tested.”

      This is much more subtle and specialized than the general word “guessing” suggests; yes, abduction could be considered a form of guessing, but it is a very constrained form of guessing.

      • question says:


        From your link:
        “There is a more familiar name for it than abduction; for it is neither more nor less than guessing.”

        You seem to explicitly disagree with Peirce on the definition of guessing. I have to admit I do not understand the OP quote at all, so there is probably some difference in how people use that word “guessing”. “Guessing” does not have a negative connotation for me.

        • Martha says:


          OK — and I see another Peirce quote on the same page saying, “The first starting of a hypothesis and the entertaining of it, whether as a simple interrogation or with any degree of confidence, is an inferential step which I propose to call abduction. This will include a preference for any one hypothesis over others which would equally explain the facts, so long as this preference is not based upon any previous knowledge bearing upon the truth of the hypotheses, nor on any testing of any of the hypotheses, after having admitted them on probation. I call all such inference by the peculiar name, abduction, because its legitimacy depends upon altogether different principles from those of other kinds of inference,”
          which supports your earlier comment that abduce is just a fancier word for guess.

          I wondered if my interpretation of Peirce might have been influenced by Bookstein, so I looked up his definition. He *quotes* (p. xxviii) Peirce as saying,

          “The first starting of a hypothesis and the entertaining of it, whether as a simple interrogation or with any degree of confidence, is an inferential step which I propose to call abduction [or retroduction]. I call such inference by the peculiar name, abduction, because its legitimacy depends upon altogether different principles from those of other kinds of inference. The form of inference is this:
          The surprising fact, C is observed;
          But if A were true, C would be a matter of course,
          Hence, there is reason to suspect that A is true.”

          This quote is dated 1903 (cited from p. 151 of Buchler, ed, The philosophy of Peirce, selected writings, 1940), whereas the three quotes (from you and me) from the website I cited were all from 1901. So this suggests that Peirce may have refined his definition between 1901 and 1903.

          Scrolling further down the website, I could not find Bookstein’s quote, but did find something close:

          “1903 | Harvard Lectures on Pragmatism: Lecture VII | CP 5.188-189

          It must be remembered that abduction, although it is very little hampered by logical rules, nevertheless is logical inference, asserting its conclusion only problematically or conjecturally, it is true, but nevertheless having a perfectly definite logical form.

          Long before I first classed abduction as an inference it was recognized by logicians that the operation of adopting an explanatory hypothesis – which is just what abduction is – was subject to certain conditions. Namely, the hypothesis cannot be admitted, even as a hypothesis, unless it be supposed that it would account for the facts or some of them. The form of inference, therefore, is this:

          The surprising fact, C, is observed;
          But if A were true, C would be a matter of course,
          Hence, there is reason to suspect that A is true.

          Thus, A cannot be abductively inferred, or if you prefer the expression, cannot be abductively conjectured until its entire content is already present in the premiss, “If A were true, C would be a matter of course.””

          In this, Peirce seems to be saying that he uses “abduction” to mean more or less what the mathematical use of “conjecture” means.

          But the upshot also seems to be that maybe we can’t be sure just what Peirce said — he seems to have refined his notion between 1901 and 1903, and different transcribers may give somewhat differing transcriptions.

  4. Fernando says:

    Science starts from observation and experience. From there, everything follows.

  5. Mark Palko says:

    When I saw the title, I assumed this was the long-awaited Pólya post.

  6. Jonathan says:

    Hard to say this on a phone, but they aren’t guesses so much as definitions of what does not work, opening a path you would otherwise not take. As in: repeated failure to find an aether defines another choice, constant light speed. That’s a frustration with the file drawer problem: we lose the definition of what is not, and so shoot arrows into the dark hoping to hit a target which might better be seen only by what it is not.

    Take Feynman. His work was a specific response to inability to calculate in certain cases. It defined a method because that was what was left by what was not found and not working.

    A key skill seems to be the ability to give up looking for the solution being where you think it should be and accepting that where it is not is where it actually is. I wrote that sentence carefully. Problems define what they are and what they are not and some are solved directly by answering what they are and others, typically the bigger or more difficult problems, can only be answered by surrender to failure and acceptance that we’ve defined what is not the path and that whatever is left is the way to the solution, if you can be that creative and can render your answer rigorously.

  7. Clyde Schechter says:

    Not all scientific progress results from an effort to grapple with anomalies in the data. Sometimes progress comes from an insight that unifies already recognized principles into something more parsimonious, or deeper, or having more explanatory power. Maxwell’s equations are an example of this: a unified explanation of electricity and magnetism that also explains light.

  8. ezra abrams says:

    For the last 50 years, molecular biology/genetics/development has been a target-rich area; there are lots of interesting things to do.
    A lot has to do with persistence, being in the right place, and having the right tools.
    IIRC, A. Kornberg, Nobel laureate and one of the top molecular biologists of the 20th century, the man who figured out how, at a molecular level, cells make more DNA [1], said the key to his first experiments, IIRC at Case Western, was that he needed radioactive molecules, and the only guy in the country who could make them was down the hall…

    Kornberg went to Stanford, where he built Stanford Medical School into a superstar place.
    I was at MIT in, IIRC, the mid-90s, when Kornberg came to give a seminar. The large hall was packed; this was *Arthur Kornberg*.
    About halfway through, he stops and says: the last time I was here was for my job seminar, 50 years ago. The [MIT] chair didn’t think much of my ideas.
    He then went on with his seminar, but the message was just like in Pretty Woman: mistake; big mistake.

    Which brings up an important point: these guys aren’t Nobel prize winners just because they are smart, and lucky, and hardworking: they are really, really competitive. (When R. Yalow got the prize for RIA, the story was that she would get off the plane, go home and cook her husband a kosher meal, then around 10pm go back to the lab…)

    [1] I simplify; see his book DNA Replication.

  9. artkqtarks says:

    “Science does not advance by guessing. It advances by new data or by a deep investigation of the content and the apparent contradictions of previous empirically successful theories.”

    I think this is a rather biased view, because Rovelli is a physicist, and especially a theorist in a rather narrow sub-discipline within physics. It presupposes previous empirically successful theories/models that are able to give precise, testable predictions. If you are dealing with a more complex system, whether condensed matter or living organisms, the model could be too complex to easily give predictions, or could already be an approximation/abstraction of something more realistic (a spherical cow). In principle, what we see in our daily life should be explained by (or at least not contradict) the standard model of particle physics, but that is of little help to condensed matter physicists, chemists, or biologists. In biology, you may not really have a model, other than to say that some molecules and genes are involved in this.

    Discovery of a new phenomenon is an impetus for new models/theories not necessarily because it contradicts the previous theory, but often because no one thought about it. Coming up with a new model takes some creativity, and I’d say that includes guessing. Or you can try to do some kind of screening to find something with very little guessing, as geneticists do.

    As ezra abrams mentioned above, having the right tools is important because that allows you to see something that you are otherwise unable to see. That helps both discovery of a new phenomenon and finding the clues for the right model.

  10. DK says:

    Not only does science advance by guesses; it also advances by shoddy experimentation! Here is how it typically works in modern biology, where clean experiments are difficult: lots of grant-writing PIs try to come up with good guesses (because of the grant distribution system, these guesses usually have to conform to the prevailing mindset; but that’s a different story). Then, naturally, the PI wants his students to “confirm” the guess (it is typically called a “model” at this point). So some student, typically quite technically incompetent because no one can really afford to invest any real time in his training (and because in the end it is actually beneficial to have only semi-competent experimentalists), does ten experiments that produce three different results. As long as one of them conforms to the model: bingo!!! There are always good reasons why the other experiments “failed,” so the paper is written and the model is put forward. And so it goes with almost everyone. The vast majority of experiments can never be reproduced, and most models ultimately prove to be wrong by the preponderance of evidence obtained by the massive and torturously inefficient network of somewhat smart guesses and seriously flawed experimentation. Still, the process eventually crystallizes the truth by way of elimination (after all, a certain percentage of the experiments is done right!). All of this is reminiscent of what Feyerabend advanced, even though he could never have imagined the nightmare that today’s science would become: anything goes, there is no method, victors are not judged.
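DK's "ten experiments, one conforms: bingo" scenario has a simple arithmetic core: if each null experiment independently "confirms" the model with some small probability, the chance that at least one of ten does is substantial. A one-line sketch (the 5% per-experiment rate is an illustrative assumption, the conventional significance threshold):

```python
# Chance that at least one of k experiments on a false model still
# "confirms" it, assuming an independent 5% false-confirmation rate each.
k, rate = 10, 0.05
p_any = 1 - (1 - rate) ** k
print(round(p_any, 4))
```

Under these assumptions, roughly 40% of false models survive the "ten experiments, keep the one that worked" filter, which is why selective reporting plus flexible explanations for "failed" runs is so effective at producing publishable confirmations.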
