A world without statistics

A reporter asked me for a quote regarding the importance of statistics. But, after thinking about it for a moment, I decided that statistics isn’t so important at all. A world without statistics wouldn’t be much different from the world we have now.

What would be missing, in a world without statistics?

Science would be pretty much ok. Newton didn’t need statistics for his theories of gravity, motion, and light, nor did Einstein need statistics for the theory of relativity. Thermodynamics and quantum mechanics are fundamentally statistical, but lots of progress could’ve been made in these areas without statistics. The second law of thermodynamics is an observable fact, ditto the two-slit experiment and various experimental results revealing the nature of the atom. The A-bomb and, almost certainly, the H-bomb, maybe these would never have been invented without statistics, but on balance I think most people would feel that the world would be a better place without these particular scientific developments. Without statistics, we could forget about discovering the Higgs boson etc, but that doesn’t seem like such a loss for humanity.

At a more applied level, statistics helped to win World War 2, most notably in cracking the Enigma code but also in various operations-research efforts. And it’s my impression that “our” statistics were better than “their” statistics. So that’s something.

Where would civilian technology be without statistics? I’m not sure. I don’t have a sense of how necessary statistics was for quantum theory. In a world without statistics, would the study of quantum physics have progressed far enough so that transistors were invented? This one, I don’t know. And without statistics we wouldn’t have modern quality control, so maybe we’d still be driving around in AMC Gremlins and the like. Scary thought, but not a huge deal, I’d think. No transistors, though, that would make a difference in my life. No transistors, no blogging! And I guess we could also forget about various unequivocally beneficial technological innovations such as modern pacemakers, hearing aids, cochlear implants, and Clippy.

Modern biomedicine uses lots and lots of statistics, but would medicine be so much worse without it? I don’t think so, at least not yet. You don’t need statistics to see that penicillin works, nor to see that mosquitos transmit disease and that nets keep the mosquitos out. Without statistics, I assume that various mistakes would get into the system, various ineffective treatments that people think are effective, etc. But on balance I doubt these would be huge mistakes, and the big ones would eventually get caught, with careful record-keeping even without statistical inference and adjustments. Without statistics, biologists would not be able to sequence the gene, and I assume they’d be much slower at developing tools such as tests that allow you to check for chromosomal abnormalities in amnio. I doubt all these things add up to much yet, but I guess there’s promise for the future. Statistics is also necessary for a lot of drug development—right now my colleagues and I are working on a pharmacodynamic model of dosing—but, again, without any of this, it’s not clear the world would be so much different.

The Poverty Lab team use statistics and randomized experiments to see what works to help the lives of poor people around the world. That’s cool but I’m not ultimately convinced this all makes a difference in the big picture. Or, to put it another way, I suspect that the statistical validation serves mostly as a way to build political consensus for economic policies that will be effective in sharing the wealth. By demonstrating in a scientific way that Treatment X is effective, this supports the idea that there is a way to help the sort of people who live in what Nicholas Wade would describe as “tribal” societies. So, sure, fine, but in this case the benefits of the statistical methods are somewhat indirect.

Without statistics, we wouldn’t have most of the papers in “Psychological Science,” but I could handle that. Piaget didn’t need any statistics, and I think the modern successors of Piaget could’ve done pretty much what they’ve done without statistics, just by careful observation of major transitions.

Careful observation and precise measurement can be done, with or without statistical methods. Indeed, researchers often use statistics as a substitute for careful observation and precise measurement. That is a horrible thing to do, and if you have a clear understanding of statistical theory, you can see why. But statistics is hard, and lots of researchers (and journal editors, news reporters, etc.) don’t have that understanding. When statistics is used as a substitute for, rather than an adjunct to, scientific measurement, we get problems.

OK, here’s another one: no statistics, no psychometrics. That’s too bad but one could make the argument that, on the whole, psychometrics has done more harm than good (value-added assessment, anyone?). Don’t get me wrong—I like psychometrics, and a strong argument could be made that it’s done more good than harm—but my point here is that the net benefit is not clear; a case would have to be made.

Polling. Can’t do it well without statistics. But, would a world without polling be so horrible? Much as I hate to admit it, I don’t think so. Don’t get me wrong, I think polling is on balance a good thing—I agree with George Gallup that measurement of public opinion is an important part of the modern democratic process—but I wouldn’t want to hang too much of the benefits of statistics on this one use, given that I expect lots of people would argue that opinion polls do more harm than good in politics.
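To make the polling point a bit more concrete, here is a minimal sketch of the textbook margin-of-error calculation. It assumes a simple random sample and the usual normal approximation, which real polls (with their weighting and nonresponse adjustments) only roughly satisfy; the function name `margin_of_error` is just an illustrative choice.

```python
import math

def margin_of_error(p_hat, n, z=1.96):
    """Approximate 95% margin of error for a sample proportion.

    Assumes a simple random sample of size n and a normal approximation;
    z=1.96 is the standard multiplier for a 95% interval.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)

# The margin shrinks like 1/sqrt(n): 100 times the respondents
# buys only 10 times the precision.
for n in (100, 1000, 10000):
    print(n, round(margin_of_error(0.5, n), 3))
```

This is the part that arithmetic alone does not give you: how far a sample's answer might plausibly be from the answer you would get by asking everyone.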

The alternative to good statistics is . . .

Perhaps the most important benefits of statistics come not from the direct use of statistical methods in science and technology, but rather in helping us learn about the world. Statisticians from Francis Galton and Ronald Fisher onward have used statistics to give us a much deeper understanding of human and biological variation. I can’t see how any non-statistical, mechanistic model of the world could reproduce that level of understanding. Forget about p-values, Bayesian inference, and the rest: here I’m simply talking about the nature of correlation and variation.

For a more humble example, consider Bill James. Baseball is a silly example, sure, but the point is to see how much understanding has been gained in this area through statistical measurement and comparison. As James so memorably wrote, the alternative to good statistics is not “no statistics,” it’s “bad statistics.” James wrote about baseball commentators who would make asinine arguments which they would back up by picking out numbers without context. In politics, the equivalent might be a proudly humanistic pundit such as New York Times columnist David Brooks supporting his views by just making up numbers or featuring various “too good to be true” statistics and not checking them.

So here’s one benefit to the formal study of statistics: Without any statistics, there still would be numbers, along with people trying to interpret them.

Could governments and large businesses be managed well without statistics? I’m not sure. Given that half the U.S. Congress seems willing to shut down the government from time to time, it’s not clear that any agreement on the numbers will have much to do with political action. Similarly, all the statistics in the world don’t seem to be stopping the euro-zone from drifting. But maybe things would be much worse without a common core of statistical agreement. I don’t know; unfortunately this seems like the sort of causal question that is too difficult for statistics to answer.

Finally, one way that statistics is potentially having a huge impact in our lives is through the measurement of global warming and all the rest. But I’m guessing that a lot of this could be done with a pre-statistical understanding. The basic physics is already there, as would be the careful measurements. Statistical modeling is certainly relevant to the study of climate change—if you’re trying to reconstruct historical climate conditions from tree-ring data, it’s tough enough to do it with statistical modeling, I can’t imagine how it could be done otherwise—but the basic patterns of carbon dioxide, temperature, melting ice, etc., are apparent in any case. And, even with statistics, much uncertainty remains.


When I started writing this post, I was thinking that statistics doesn’t really matter, but I think that’s because I was focusing on some of the more highly-publicized but less beneficial applications of statistics: the use of statistical experimentation and inference to get p-values for tabloid-bait scientific papers, or for Google, Amazon, etc., to perfect their techniques for squeezing money out of their customers or, even at best, to test a medical treatment that increases survival rate for some rare disease by 2 percentage points. But statistics is central to how we think about the world. I still think that statistics is much less central to our lives than, say, chemistry. But it ain’t nothing.

197 thoughts on “A world without statistics”

  1. Why does polling need Statistics? If you just allow everyone to vote, all you need is arithmetic, right?

    Among uses, did you forget insurance? It’s hard to do good insurance without using statistics in some form, explicitly or implicitly.

    Also, fraud detection? Statistics helps us catch crooks that may otherwise be a drag on society.

    A questionable use, but perhaps good, is the running of casinos & lotteries. Statistics might help your business not go bust there.

    A lot of modern financial instruments including HFT seem grounded in statistics. Again one may disagree about their utility. Even something basic like the formula for pricing a derivative seems statistically derived.

    • On insurance, non-life insurance got by quite happily without any sort of stats whatsoever until the actuaries got involved. On the life side, although one can argue that much actuarial maths is statistical, it was (and to a certain extent still is) treated as a deterministic system with little attention paid to variation (quite a lot of actuarial science predates the modern treatment of probability).

  2. What the world would be like without statistics would depend, in part, on what “statistics” means. Having a language that allows uncertainty seems important, and that this is applied to science also seems important. How specifically this is done is less important, so Piaget didn’t need a t-test, but he needed to recognize variation in growth.

  3. It is certainly an interesting thought experiment.

    I think it becomes difficult to describe, as you somewhat imply, because it depends really on how you define “statistics.” The effect of, say, retroactively removing the ability to calculate Chi-square tests from contingency tables (or whatever other formal statistical method you want to talk about) really wouldn’t be that significant. I think, at best, you could make the argument that without this formal framework a number of scientific discoveries would have been more difficult or taken longer to figure out.

    On the other hand, statistics is more than just a set of formal tests, it is a philosophical approach to inference. And in a lot of ways there is a very fine line between statistics and just plain math. For example, even though it isn’t formally a statistical method, the practice of celestial navigation (through the use of sextants and such) is based on taking a number of discrete numerical measurements and using them to make an inference about relative location, and within this field there are ways of handling or controlling for different forms of uncertainty. I would consider this philosophically in line with statistics, even if it isn’t technically statistical in nature.

  4. Imagine no statistics
    It’s easy if you try
    No p values below us…
    Above us only sky
    Imagine all the researchers
    Interpreting numbers today…

    Imagine no publications
    I wonder if you can
    No need for tenure…
    A brotherhood of man
    Imagine all the professors
    Sharing all the world…

    You may say I’m a dreamer
    But I’m not the only one
    I hope someday you’ll join us
    And the world will live as one

  5. The scientific method is based on “statistics”–that is, you make a change in some input variable and observe the result and compare it with the unchanged input situation. The outcome after the change is almost always variable (so, statistical). It’s difficult to imagine any real progress in science without the scientific method. The argument given in the blog is really about “What if there was no formalization of statistics?”–that could probably be dispensed with (formalization being the p-values, decision rules, underlying mathematical theory, etc).

    • I think that scientists would assess variation intuitively. So statistics would be effectively replaced by intuitive statistics (which is what we use to define priors).

      Now, what would a world without intuitive statistics or any grasp of variation at all be like?

      • I think the concept people have the most problem with is what is now being called “concentration of measure”. That is, laws of large numbers, central limit theorem, etc. Individuals who understood this concept intuitively would make better scientists (less likely to believe something is true with small n and an expected effect–look at all the psych experiments Andrew is always trashing). Of course, in some fields, having an intuitive knowledge of variation would hurt you–it would prevent you (at least, if you were intellectually honest) from committing what is now being called “soft fraud” (that is, you don’t fake the data, or the results, you just interpret them however you wish without concern for coherence or contradictions).
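The small-n point in the comment above can be illustrated with a quick simulation. This is only a sketch under assumed conditions: the true effect is zero, the noise is standard normal, and the function name `spread_of_sample_means` and its parameters are hypothetical choices made for illustration.

```python
import random
import statistics

def spread_of_sample_means(n, reps=1000, seed=0):
    """Standard deviation of the sample mean across repeated studies of size n.

    Each simulated study draws n observations from a normal distribution
    with true mean 0 and standard deviation 1.
    """
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n))
        for _ in range(reps)
    ]
    return statistics.stdev(means)

# With small n the estimates bounce around a lot even though the true
# effect is zero; the spread falls off roughly like 1/sqrt(n).
for n in (10, 100, 1000):
    print(n, round(spread_of_sample_means(n), 3))
```

A researcher who expects an effect and sees one large estimate at n = 10 is looking at exactly the kind of noise this simulation produces.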

    • Whenever anyone mentions “the scientific method”, two things come to mind:

      1. A sentence in a short publication put out a number of years ago by the Howard Hughes Institute designed to encourage young people to enter science that read, “There are probably as many versions of the scientific method as there are scientists.”

      2. A committee (consisting of one physicist, two biologists, two geologists, one chemist, one mathematician, and one person who taught science to inservice elementary teachers) that I was on a few years ago that was tasked with designing a science course for pre-service elementary teachers. It sure gave lots of evidence of the assertion in item 1. (But the discussion was fascinating.)

      • Replace “scientific method” with “experimental method”, then. One of the reasons the social sciences are so hard is that it’s very hard to do experiments (though Facebook is starting to change that).

        • It’s hard (often impossible) to do experiments in geology, astronomy, some areas of biology, and some areas of physics. So the social sciences could probably learn a lot about doing science from these areas.

  6. “At a more applied level, statistics helped to win World War 2, most notably in cracking the Enigma code but also in various operations-research efforts. And it’s my impression that “our” statistics were better than “their” statistics. ”

    Actually my impression is that WW2 was won by throwing overwhelming men and material at the Axis. I forget which historian said this, but the wonder is not that Germany lost, but that it held out against the rest of the world for as many years as it did. Twice.

    Anyway, more to the point: I too have come across the meme that “statistics won WWII” (or OR, or some variant).
    Any pointers to a thoughtful analysis of the role of the statistical method in WWII? Most works I have seen are too science boosterish for my taste.

    • Well yes. Always check things like relative GNP and population before attacking someone. But wait, those are statistics too. Descriptive, not inferential, but still stats.

      However I think there is a good argument that statistics contributed to shortening the war. Certainly work by people like Alan Turing and his associates at Bletchley Park on the German naval codes helped reduce the possibility of defeat, if not win the Battle of the Atlantic.

      And some of the operations research work paid off quite well, I believe. Not, probably, as decisively as Enigma, but I am not really up on this.

      The likely reason they held out so long seems to be that qualitatively the German army was better on a man-for-man basis than any other.

      Germany might have held out much longer if Hitler was not so insane but he essentially doomed Germany when he invaded the USSR.

      • Big mistake was attacking the USSR _before_ finishing off England (only 2/3rds the German Army could be used in the East where the Soviets were able to use 90% of the Red Army against the Germans). As far as Enigma, the British told the Soviets the starting date of Barbarossa (June 22, 1941–Stalin dismissed this), and also let them know the starting date of Operation Citadel (the 1943 attack on the Kursk salient–Stalin believed this one). Radar was probably the one single most important technological factor in the war (simply because it allowed the UK to win the Battle of Britain, which kept Britain in the war, which meant the Germans couldn’t use their entire army against Russia). RADAR requires the measurement of elapsed time between signal propagation and return, so this is a statistical problem. But solving it does not require a knowledge of the mathematical theories underlying statistics.

    • Actually the first efforts at cracking Enigma were led by Polish intelligence. After invasion of Poland the latter passed info to Britain.

      One area where cracking Enigma proved super useful was in sinking German submarines blocking Atlantic supply routes. And in sinking supply vessels for Rommel in North Africa. Without these, the war in Europe could have lasted as long as in the Pacific and, who knows, maybe given time for Germany to develop the A-bomb, though I doubt it.

  7. This is a good post but I differ on two points.

    With respect to baseball analysis, I agree that James helped advance the cause via a number of essays/studies in which he showed that certain trends were not really trends at all upon examination at some larger scale, but were variations from a central tendency. However, I think he also confused the issue by equating the term “luck” with “random variation”. In a skill endeavor like baseball, that introduces a big time hornet’s nest, rooted in just exactly what those two terms really entail. I’ve had no end of frustrating arguments on the issue with folks who just buy into whatever James says, because well, he’s Bill James.

    The more important difference is with climate change. In my view, solid statistical analyses are absolutely vital to that field, and a good number of problems have arisen due to failures along those lines. Although some physical laws are indeed known with pretty high confidence, as you state, most notably the radiative forcings of important greenhouse gases, everything beyond that has a large degree of uncertainty attached to it. When you start asking questions like how much of the observed GMST change since some baseline period is due to GHGs, or human drivers more broadly, you are immediately well beyond the realm of confidence implied by the knowledge of GHG forcing values alone. You’re into complex cause and effect systems with a lot of uncertainty and a whole lot of noise.

    As for tree rings and climate estimates specifically, those have even more fundamental problems which include gross oversimplifications of mathematical and statistical issues related to the fundamental form of the biological response to temperature (Unimodal NOT linear!), and with how to identify, and mathematically remove, inherently biological trends from *overall* growth trends (which have both biological and non-biological components). No statistical analysis of any kind can compensate or correct for these fundamental mathematical mistakes.

  8. The origin of Statistics is tied to the government’s desire to measure, tax and control the people.

    Statistics (sometimes) provides an illusion of understanding. (Hayek called this “synoptic delusion”).

    For example: Would the Soviet five year plans with their obsessive focus on hitting targets for everything from Steel to Shoes have been possible without the intellectual apparatus (and cover) provided by Statisticians? (P.C.Mahalanobis, India’s uber planner, was also a statistician).

    Can we attribute at least some of the misery caused by Central planning to a Cultish use of Statistics?

    I think we can. A very large share.

    • Statisticians have been on the end of this misery. Those who supervised the Soviet Census of 1939 were executed (their numbers–accurately–reflected at least a million deaths from the purge of 1936-38). And, of course, the Italian statisticians on trial for failing to predict an earthquake. Happens to other professionals also–Soviet scientists executed in 1937 for “concealing ore deposits for the arrival of the Germans”–that is, they failed to find it. But the cultish use of statistics goes back to Biblical times–Jesus was born in Bethlehem because the Roman world was being taxed and Joseph and Mary had to go there for enumeration purposes (now, in actual fact, that probably didn’t happen–prophecy dictated that the Messiah would be born in Bethlehem and so the gospels probably made this up, but for the point of this argument all that matters is that it was plausible enough a rationale to make it as a justification for the location of Christ’s birth).

      • A popular history of statistics, The Lady Pouring Tea, notes that statisticians in Stalin’s Soviet Union were inherently ideologically suspect because they measured variation or error, which in the planned economy of the worker’s state was supposed to be a relic of the bad old days of capitalism and inequality.

        • It’s the Lady *Tasting* Tea and I was thinking about it as well while reading the article. The book describes several developments in statistics that came about while working on real-world problems, notably in agriculture, weather and beer-making. Recommended reading to be sure.

  9. I like that post very much. Gives us something to think about.
    Haven’t thought about statistics this way for long.

    By the way, since Chuck already’s written some lyrics
    about the topic, may I cite Mr. Wang Waowei from China,
    who according to the Wall Street Journal (2009) wrote:

    Love the Motherland, Love Statistics

    In Life
    Some mock me for doing statistics
    Some loathe me and statistics
    Some don’t understand what statistics are
    Why is it that statistics
    Put a calm smile on my face?
    Because of statistics I can solve the deepest mysteries
    Because of statistics I will not be lonely again, playing in the data
    Because of statistics I can rearrange the stars in the skies above

  10. I think you have articulated some good, high-level observations about the limited impact that descriptive statistics has had. When I say descriptive statistics, I am referring to the focus on estimating probabilities and getting the confidence/credible intervals just right. From my perspective as a scientist, it is conspicuous how much effort in the field of statistics is not oriented towards discovery, but (stating my view prejudicially) just towards “getting the standard errors right.”

    Regarding probabilistic causal modeling, however, I see things quite differently. I imagine that your world view about causal modeling is embedded in the social sciences. I understand that when the subject species is Us and our behavior, disentangling causal relations may seem like a maddening enterprise or even a lost cause to some. As someone in the natural sciences, however, I think we are making big discoveries quite commonly now by evaluating hypotheses about causal networks of relationships using structural equation modeling. Of course it is the failure of the original models and discovery of omitted links in the networks that lead to further studies/measurements and ultimately discovery. I am not at all naïve about the chances for any given model result to be wrong in major ways, but the sustained application of SEM principles over a series of studies is yielding big returns and many (not all) of these are verifiable. This harnessing of statistical power in probabilistic causal networks is where we are seeing the big payoffs (in my view).

    • Even though I don’t fully understand the math mechanics of path analysis and SEM, I do understand their purpose, and it often seems to me to be sort of the pinnacle of estimating cause and effect relationships in complex systems with many interacting variables, via probabilistic analysis.

    • I disagree. Without statistics, we would be inundated with at least 10 times the number of poorly conceived and ill-justified papers, and it would be even more difficult to sort out the valid from the invalid claims.

      Statistics may be misunderstood and abused rampantly in the current academic publishing system, and there are certainly better alternatives. But any alternative not founded upon statistical principles would be worse.

      • This is just so counter to experience. Most of the time statistics is used as a smoke screen to conceal poorly conceived and ill-justified papers. Removing the smoke screen isn’t going to make researchers lazier.

        • I agree that the bar is too low to get published in some fields. I agree that the bar needs to be raised. I disagree that removing the bar entirely would improve the situation. Journal policy, study pre-registration, and more rigorous statistical review are clear ways to solve this problem, or at least minimize it. A statistical framework is not a complete solution, nor is it perfect. But any good solution should involve statistics. I don’t see any legitimate alternative.

        • Let me put it this way. There are two general types of mistakes that can be made.

          One is where someone’s unaided intuition doesn’t understand how strong or weak the evidence is because they can’t see the implications of given probabilities without doing a calculation.

          Another is where people’s whole intuition about variabilities is so bad they can’t even frame the question right most of the time, let alone understand the answer, or get something useful out of it.

          Here’s my claim: Introductory Statistics – as taught these past 75 years or so – reduces the occurrence of the first mistake, while dramatically increasing the second one.

          In my personal experience, I’ve almost never seen someone make the first kind of mistake in the real world, but see the latter more often than not.

  11. It’s difficult to say what we’d be missing but perhaps we can model it!

    An example: RA Fisher’s work in agriculture. (If you look at the Green Revolution, it uses statistics a lot.)

    So maybe we’d have a lot less people and much more starvation.

    • My vague impression about Darwin’s finches is that their analysis had less to do with what we would think of as statistics and more to do with picking out a “type” specimen that “best” represented each island.

      I once asked the curator of the Field Museum why that institution treats their spectacular collection of sculptures by Malvina Hoffman, “The Races of Mankind,” so shabbily. She replied that Hoffman followed the outdated concept of “types.” The implication was that Linnaean “type” thinking tends to exaggerate differences between racial groups.

      I can see her point: The Field Museum’s magnificent life-size sculpture of a Sudanese Nuer is 6’8″ even though the average Nuer is probably a half foot or more shorter. It’s not statistically realistic to depict the average Nilotic as being as tall as Luol Deng of the Chicago Bulls.

      On the other hand, the most noticeable aspect of the Dinka and Nuer of the Upper Nile is their extraordinary elongation (e.g., the Dinka NBA player Manute Bol was seven and a half feet tall). So, a 6’8″ “type” conveys the lesson “These guys tend to be really tall” better than a 6’2″ statue would.

      The Linnaean tradition predated Quetelet and Galton, and yet it still works surprisingly well.

  12. I love the “Imagine” parody above, but I prefer to think of an action movie trailer narrated by one of those really deep-voiced movieguys.

    In a world

  13. You wouldn’t have autocorrect… would we have anything like the mobile communications we have now without autocorrect? The iPhone and subsequent followups rely heavily on it.

  14. And I guess we could also forget about various unequivocally beneficial technological innovations such as modern pacemakers, hearing aids, cochlear implants, and Clippy … and Stan, surely?

  15. I find it hard to imagine someone wouldn’t have stumbled upon statistics eventually in an alternate history where science continues to progress.

  16. Rational decision making is an almost entirely statistics-driven process, whether that statistics is an explicit science or just some individual intuitively going off previous experience. I think it’s kind of impossible to make decisions without some major or minor form of statistics being involved.

  17. I think part of what you’re missing is the fact that statistics is money saving. It’s not that Gosset couldn’t have figured out the optimal treatments for his hops without inventing the t-test, it’s that he saved a ton of time and money by figuring out how to optimize statistical power and determine the best treatments with fewer resources. Whenever we use statistics to figure something out a little faster, that’s fewer study subjects exposed to risk, fewer patients dying, fewer restaurant patrons poisoned, resources saved and diverted to more valuable endeavors. Maybe it hasn’t shattered paradigms, but the sum of all the savings, in lives and resources, generated by efficient use of statistics since Gosset’s time is probably quite immense.

    • When I worked statistical quality control, two things stood out: (1) the amount of testing was entirely dependent on factors unrelated to my power calculations, and (2) the statistical write-up was merely a respectable veneer applied to conclusions that were already obvious to the data collectors without statistics.

    • “I think part of what you’re missing is the fact that statistics is money saving.”

      Right. Without the W. Edwards Deming revolution in statistical quality control implemented by Toyota a half century or so ago, we’d still have automobiles, they’d just cost twice as much and have four times as many defects. (Warning: all quantities pure SWAGs.)

  18. The best case for taking statistics seriously, in my view, is that countless arguments involve statistical reasoning. Making such reasoning more precise is beneficial inasmuch as it clarifies the nature of such arguments, enabling us to identify their weak points and develop better arguments in turn.

    How much do we lose by reasoning intuitively? In some cases, not much, but as any good statistician knows, tweaking a model in subtle ways often leads to very different conclusions, and intuition alone is usually insufficient to sort out which tweaks make sense and which do not. Without thinking hard about the choices available to the analyst, one may make indefensible decisions, and may consequently arrive at erroneous conclusions. In cases where the conclusion is very clear (e.g., the classic physics experiments you mention), such tweaks don’t matter as much to the big picture.

    A world without statistics would either be a world in which people routinely came to erroneous conclusions, or a world in which people couldn’t say much about anything except a few iconic experiments with unambiguous implications. Unfortunately, people routinely come to erroneous conclusions even though statistics is ubiquitous. This doesn’t have much to do with statistics itself, however, but with misunderstandings of statistics, perverse research incentives, etc. etc.

    What makes our world better is that statistics helps us identify flaws in nominally statistically sound analyses.

    • For example, it’s considered in very bad taste to reason statistically about immigration policy, as we saw in 2013 when Jason Richwine got fired from his job when it was discovered that he had written a statistically sophisticated Harvard dissertation.

      Not surprisingly, American immigration policy is kind of stupid.

  19. At first I was reading this as a clever parody in which every example showed how important statistics is but each was dismissed on its own as insufficient instead of being properly appreciated in the aggregate. Poe’s law, I guess.

    Don’t neglect statistical mechanics when saying statistics isn’t as central to our lives as chemistry or when talking about the development of quantum. If not for Planck’s black body radiation formula leading to quantum theories, Einstein’s measurement of Avogadro’s number through the viscosity increment of a sugar solution to confirm molecular theory, etc., the world would be very different. Stat mech is important from biology to astronomy these days.

  20. Lossless data compression and a great deal of communications technology rely heavily on statistics. Phones and the internet would be very different.
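
To make the link between compression and statistics in comment 20 concrete, here is a minimal sketch that computes the empirical Shannon entropy of a string's symbol frequencies. Under the simplifying assumption that symbols are coded independently (a zeroth-order model), this is a lower bound on the average bits per symbol of any lossless code. The function name and the two example strings are hypothetical, and real compressors also exploit context, which this sketch ignores.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(text):
    """Empirical Shannon entropy (bits per symbol) of the symbol
    frequencies in `text`, treating symbols as independent draws."""
    counts = Counter(text)
    total = len(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


skewed = "a" * 90 + "b" * 10  # one symbol dominates
balanced = "ab" * 50          # both symbols equally frequent

h_skewed = entropy_bits_per_symbol(skewed)      # roughly 0.47 bits per symbol
h_balanced = entropy_bits_per_symbol(balanced)  # 1 bit per symbol
```

The skewed source needs fewer bits per symbol on average because its frequencies are uneven. Knowing those frequencies is exactly the statistical information that a code such as Huffman coding exploits.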

  21. I know that I’m biased, but you have completely omitted an essential technology for large-scale social organization, namely finance.

    Granted, statistics as a tool in finance has historically cut both ways, but which tools, real or conceptual have not?

    • So without regression/copulas/CAPM/ and so on, we never would have been able to finance the industrial revolution?

      Exactly what great thing has finance done that (1) was a benefit, and (2) required statistics?

      • Finance was most definitely a precursor to the industrial revolution.

        In fact, the entire history of human social evolution is tightly intertwined with the advancement of the technologies of commerce and economics. Most of the earliest surviving examples of writing are accounting, contracts and tax records. The systems of clay tokens and “bulla” are widely considered the precursors to the earliest cuneiform writing.

        Many critical developments in mathematics grew out of the demand for commerce education in the Italian city states beginning around 1200. Leonardo of Pisa’s Liber Abaci is best remembered for the Fibonacci sequence, but the early chapters of the text are focused on practical problems of commerce. The later chapters on growth sequences are relevant to the growth of many forms of real investment (e.g., Fibonacci’s rabbits).

        An essential form of finance, which existed from the earliest Mesopotamian records, is the forward contract. It became essential as soon as humans began to engage in large-scale agriculture. There is a long history of work by some of the greatest minds on the problems of formalizing the inherent risks associated with the forward contract (e.g., Edmond Halley’s 1693 work on the pricing of life annuities, building on the earlier work of Graunt and others).

        Nineteenth and Twentieth Century finance would not have been possible without statistics, and the industrial, scientific and social development of the age of industrialization would not have been possible without finance.

        • Huelsbeck,

          I think you misunderstood. I’m a finance guy and have nothing against the subject. It’s definitely true finance people have successfully used data.

          But I don’t know anything great that finance did which REQUIRED modern statistical inference or methods.

          Even seeming counter examples like option pricing theory, aren’t counter examples since people successfully used and priced options for many hundreds of years before stochastic differential equations were invented.

        • How about time series / forecasting of things like revenue? What about risk modeling? I can’t fathom how many finance folks (particularly outside of financial institutions) can do WITHOUT those (and/or econometric) methods. Yet they do. And unfortunately, the flame is rarely held to the output of this work; rather, it is taken as gospel.

          Unless you’re a ‘quant’, most finance work within companies, yes, uses data, but does little more than reporting and dashboards of summary data. Many financial groups still keep things in Excel (a potentially very dangerous experiment where information, at its most fundamental level, loses cohesion and can become horribly fragmented due to how inputs/outputs are not well controlled, if at all). Somehow these groups manage to sidestep basic best practices in data, a statistical foundation in and of itself.

          Still, they are often called upon to explain ‘why’ something is or is not based on their output. And in those cases plenty of finance folk are hard pressed to come up with anything but mere conjecture based on ‘data observation’ and their own biases, when a more scientific approach employing statistical inference is warranted.

          Don’t get me wrong, the empirical approach has its place, but not to the degree currently employed in many businesses today. Both statistical thinking and methods are very important and often overlooked litmus tests which may support or challenge the soundness of developed financial information. But then it is also true that many analysts in this arena don’t really care to have their work challenged (and there are indeed pressures which either incite and/or reinforce this behavior). I think it is also true that organizationally, accounting often becomes blurred with financial analysis, when the two functions should intrinsically have separate purposes.

  22. “Without statistics, biologists would not be able to sequence the gene, and I assume they’d be much slower at developing tools such as tests that allow you to check for chromosomal abnormalities in amnio. I doubt all these things add up to much yet, but I guess there’s promise for the future.”

    1) I don’t think you need statistics (in any strong sense) for some sequencing technologies, but 2) sequencing is in active use in medicine already. The most obvious example is HIV drug resistance testing. A patient who is failing therapy gets a few key HIV genes sequenced, and the presence and pattern of resistance mutations dictates whether and which drugs to switch. This matters.

  23. Galton’s major breakthroughs in statistics came a couple of centuries after Newton’s in math and physics, which suggests that there wasn’t all that much demand for statistical innovation.


    The sheer number of innovations that Galton made, and usually at an advanced age, implies that he didn’t have much competition. In fact, he was out by himself not just in finding answers to questions, but in asking the questions that now seem pretty obvious to us.

    I confess to being baffled by why the statistical mindset took so long to get started among human beings.

  24. I suspect the fundamental reason that there wasn’t much demand for advanced statistics until fairly recently was because, as Yogi Berra said, you can observe a lot just by watching. If you pay close attention to something, you can reach, maybe, 80% of the conclusions that you could with advanced statistics.

    For example, baseball now has a statistic called Wins Above Replacement that estimates how many additional wins per season a player contributes to the team total over the kind of nonentity a team can normally pick up from the minors or from another team. This came up last week when the New York Times ran a big story about how Tommy Lasorda and the LA Dodgers had ruined the baseball career of a gay player named Glenn Burke out of homophobia by trading him to lowly Oakland in 1978.

    I pointed out, however, that Burke, although a good athlete, was a bad ballplayer: his WAR statistics were consistently below zero. In fact, Burke got more major league playing time than his performance ever warranted. The Dodgers in fact pulled off a major steal in 1978 by picking up Billy North for Burke. WAR statistics now say that trade boosted the Dodgers’ win total that season by 2 or 3 games, and they finished exactly 2.5 games ahead of the Reds to make the playoffs.

    Here’s the thing though: I listened to maybe 50 Dodger games on the radio in 1977, and, long before the invention of the WAR statistic, I knew Burke was bad, that he wasn’t helping the Dodgers win. I had the basic statistics around back then, but mostly I just had my impression of listening to all these games and looking at the box scores for others. I knew Burke was a rally killer. I knew he made too many outs compared to how many times he got on base.

    Listening to Vin Scully for 100 hours is a low bandwidth way to learn compared to glancing at WAR. But, the point is, it can be done.

    • Steve,

      This seems overstated to me. I don’t see how this supports the idea, “If you pay close attention to something, you can reach, maybe, 80% of the conclusions that you could with advanced statistics.”

      When we talk about conclusions in this way, it sounds like we’re considering whether or not ‘conclusion via observation’ is commensurate with ‘conclusion via statistical inference 80% of the time’. ‘Paying close attention to something’ also needs some definition. As a case example, I point to the majority of political analysts’ predictions on voting outcomes vs., say, the analysis developed by one Nate Silver. Mere empiricism didn’t seem to come anywhere near the area code of a more robust solution. It was sort of a case of ‘how can one person be right and everyone else wrong’? In this case, the empirical judgments were mostly incorrect. But this is only 1 example.

      I agree with you on the *possibility* of concluding correctly by empirical means alone, but what is the likelihood of that, and how reproducible is it? And how domain/problem dependent is that? I would argue at least that correctness in observational conclusions becomes rarer with increasing complexity of a problem.

      Will you elaborate further?

  25. Is *covariance* included in “statistics”? Be odd if it weren’t!

    Without that, there’d be no science. Science’s way of knowing features inference from observation; without the concept of “covariance,” observation would be aimless & valid inference impossible.

    Nor is there anything *obvious* about covariance itself! It had to be invented (by Galton, it is conventionally said; but I think that simplifies) — it isn’t “built in” to how people make sense of experience. Indeed, people intuitively *don’t* organize what they see in patterns that reflect the critical role that covariance plays in causal inference, and even when they are supplied with the information necessary to detect covariance typically don’t make use of it.

    • So before Galton there was no valid inference just aimless observation?

      And people don’t intuitively see when things co-vary or make judgments about how the world works from that?

      • I recently read the impressive 2000 book “Victorian Sensation” by Cambridge historian James Secord about the intellectual reception given the anonymously-written 1844 bestseller “Vestiges of the History of Natural Creation,” which paved the way for Darwin’s 1859 “Origin of Species” by constituting a non-Creationist speculative history of the universe. Secord recounts the reactions to this immensely influential book by hundreds of prominent Victorians, such as Tennyson, Gladstone, Disraeli, Darwin, James Clerk Maxwell, T.H. Huxley, Alfred Russel Wallace, George Eliot, Florence Nightingale, Charles Babbage, Ada Lovelace (one of the two possible authors most frequently accused), Harriet Martineau, and so forth.

        Some of these people, especially the women, had quantitative turns of mind. And yet the one mindset that just doesn’t seem to come up much at all in this book about the brightest people in Britain in the middle of the 19th Century is the statistical. (Galton, who was in his 20s then and mostly hunting, isn’t mentioned.)

  26. Classical Statistics is so good at destroying people’s intuition and understanding that we’d be considerably better off without it.

    999 times out of 1000, a bright researcher with no statistical training using descriptive statistics will beat the analysis of even the most sophisticated of p-value commandos.

    • To this anecdote, I would like to see some sound support. Just for consideration, what if the problem domain(s) to which you may be referring may be facile enough to where descriptive information is sufficiently meaningful? Then of course, the need for anything more may be diminished. But then that lingering question comes to mind – how do you know?

      Philosophically, many in western society fall into the ‘reductionist’ perspective (perhaps over-reductionist). That is, the view that ‘a thing isn’t complicated until you make it so’ (I think Branscombe has stated something along these lines, but his thought has been overgeneralized from its original intent).

      In some cases, this is true, but in others, I’d say unequivocally not. A large part of what inclines us toward reductionism stems from our own lack of cognitive capacity to think in more than 1 or 2 dimensions. Yet ‘simpler’ does not necessarily connote ‘better’, particularly given one’s frame/scale of reference about a problem.

  27. The problem is not statistics and statisticians, it’s amateurs drawing conclusions that are completely wrong by misinterpreting numbers. Look at the case of the Dutch nurse Lucia de Berk http://www.luciadeb.nl/english/. Look at the case of the British nurse Ben Geen http://bengeen.wordpress.com/. Lives are totally destroyed because medical specialists make amateur statistical judgements based on bad data. Look at the case of Sally Clark. Look at all the nonsense about medical screening (breast cancer, bowel cancer, …) which causes untold unnecessary suffering and hardly saves any lives at all but does keep a lot of people busy in important jobs and keeps whole medical industries nicely making money.

    The problem is, that the statistics (ie the numbers, the patterns, etc etc) are there anyway. The world cannot be without them. They are interpreted, used, abused by incompetent amateurs and all kinds of disasters ensue. Really big disasters. We need good statisticians to help counteract some of the worst effects of the bad statistics that are out there anyway.

    • Richard:

      I agree. Your examples are consistent with the Bill James quote, “the alternative to good statistics is not ‘no statistics,’ it’s ‘bad statistics.’”

      There is another problem, though, which is that the existence of good statistics can encourage people to do bad statistics. Consider the crappy “Psychological Science”-style papers we’ve been discussing here during the past year, in particular consider the claim that, from a couple responses from an internet survey, it can be concluded that unmarried women are 20% more likely to have supported Barack Obama during the most fertile days of their menstrual cycle. This claim is ridiculous from many directions but the point I want to make here is that it is the existence of a century of good statistics (from practitioners ranging from R. A. Fisher to George Gallup) that gives researchers the confidence to make such a claim. If formal statistics had never been developed, I don’t think anyone with a straight face would make the claim that you can learn universal principles of human nature from this kind of uncontrolled sample. Many decades of exposure to statistical methods have given people the idea that such claims are scientific. In that way, “good statistics” has created the space for lots of “bad statistics.”

      • Yes I agree that modern statistical methodology has allowed a huge junk science ecosystem to gain some kind of aura of scientificity. I recall writing something like this in my list of 12 “theses” attached to my actual PhD thesis, 35 years ago. At the time I was fighting against factor analysis (I saw it as pseudo-science of the worst kind). But I wonder if we couldn’t say this about any recent (or even not recent) scientific revolution – it made a whole lot of good things possible but also created a space for a whole lot of bad things.

        But you are saying something a bit different: that the world would hardly have been different without modern statistics. Well I am not so sure about that. Say what you like about the evils of the pharmaceutical industry but it did not *only* generate money for those inside and suffering for those outside, and as well as the big advances which didn’t need statistics, there have been gradual almost continuous improvements to all kinds of treatments which certainly relied on good statistics being done in order to make it into medical practice. For instance, on the BBC this morning: a new kind of radiotherapy for breast cancer which has far fewer side effects and really has been proven to be clinically as effective as standard treatments, or better, because of a carefully carried out RCT involving several thousand patients followed up over 10 years or so.

        • Richard:

          I did not mean to say that the world would hardly have been different without modern statistics! Here’s what I wrote: “Statistics is central to how we think about the world. I still think that statistics is much less central to our lives than, say, chemistry. But it ain’t nothing.”

        • Here’s a way to test Dr. Gelman’s idea: certain uses of statistics are banned by law. For example, the federal government not only outlaws the use of race and ethnicity as a factor in deciding whether or not to extend a mortgage to an applicant, but collects a huge amount of statistics under the Home Mortgage Disclosure Act to empower community NGOs to easily sue banks for disparate impact discrimination.

          So, here’s a case where our culture has laboriously emasculated itself of one obvious use of statistics. And what has been the result? Well, the world looks much the same (except for how mortgage lending in the heavily Hispanic “Sand States” sort of kicked off the Global Financial Crisis of 2008).

          I suspect you could generalize that: without sophisticated statistical techniques, we’d still have a lot of the same attributes of the modern world, such as airplanes. They’d just cost more and crash more often.

        • To extend this notion to Dr. Gelman’s comparison of chemistry v. statistics, without progress in chemistry over the last 200 years or so, we would have a world made out of pig iron and leather. We wouldn’t have, say, airplanes.

          Without progress in statistics, we’d have airplanes, but you’d be a fool to fly in them.

        • Sailer: “mortgage lending in the heavily Hispanic ‘Sand States’ sort of kicked off the Global Financial Crisis of 2008”

          Don’t blame it on latinos. Due to limited IQ we only know how to wash dishes. It was the smart white anglos that did the lending.

        • Sailer does blame those who decided as a matter of policy to give loans based on something other than the merits of each individual’s creditworthiness.

          Incidentally Fernando, can we count you among those who believe Latinos should only ever be judged on individual merit, even if that means they get much fewer loans, fewer admissions to ivy league colleges, fewer promotions, and so on?

        • You can count me among those who dislikes racism, prejudice, and pseudo science.

          Incidentally can “we” count you amongst those who think being white and anglo is not a merit per se?

        • That doesn’t answer the question. Do you favor, in principle, only ever judging Latinos on individual merit irrespective of the average consequence that has for Latinos taken all together?

          I do, but I can’t tell whether you agree or not. So if you’re not willing to answer the question, then answer this one:

          Does this stance make me a pseudo scientific prejudiced racist?

          I honestly can’t tell anymore.

        • No Fernando, I don’t think whites have a better God given soul than anyone else. I think in fact, our souls are all of the same nature, or at least, different chips off the same rock.

          I’ve never met anyone who thought otherwise actually.

          I have met people, however, who think (1) different subsets of people sometimes have different average metrics, (2) sometimes those averages can be made to change over time, and sometimes they can’t, and (3) sometimes different averages are relevant and should be considered, and sometimes they’re not, or shouldn’t be considered out of basic charity to our fellow man (which we all need sometimes).

          Moreover, not one of those people I’ve met who held such pseudo-scientific ideas treated other individuals poorly, or refused to judge them as individuals when it made sense, or failed to recognize the inherent similarities and worth of all souls, or ever committed acts of violence against people just because they were different in these ways.

          So I’m at a bit of a loss to explain why anyone considers people who hold such opinions as devil incarnate and the second coming of Hitler.

        • Anon, you do realize that plenty of studies have shown that human beings (black, white, latino) generally have various prejudices against blacks and latinos (more likely to commit crime, less likely to get a job etc.), right?

          This sentence is thus bollocks: “Moreover, not one of those people I’ve met who held such pseudo-scientific ideas treated other individuals poorly, or refused to judge them as individuals when it made sense, or failed to recognize the inherent similarities and worth of all souls, or ever committed acts of violence against people just because they were different in these ways.”

          They might not have wanted to do it, but they will have done it.

        • Yes, I am aware the statistics show that an encounter drawn at random between a white person and a black person is far more likely to result in black-on-white violence than it is to result in white-on-black violence.

          I didn’t want to mention it out of politeness.

          Steve Sailer though, who started this discussion, seems not to have harmed or threatened anyone his whole life.

        • I blame President George W. Bush most of all for the mortgage crisis. As part of his “Ownership Society” campaign, which was heavily oriented toward converting Hispanics into Republican voters by getting them mortgages, he sponsored the October 15, 2002 White House Conference on Increasing Minority Homeownership, where he demanded the real estate industry create 5.5 million new minority homeowners by 2010. In turn, he warned his federal regulators that enforcing traditional downpayment and documentation norms was racist.

          The second biggest villain was Countrywide CEO Angelo Mozilo (not Hispanic, by the way), who wanted to make Countrywide dominant in the mortgage industry by growing from 10% to 30% market share. He saw the chief way to do this through massively increased lending to Latinos. For example, here’s Countrywide’s 1/14/05 press release announcing plans to lend a trillion dollars to minority and lower income borrowers:


          Numerous little-publicized studies since the Bust have shown that Hispanics had extremely high default rates, which is to a sizable extent why the crash started in the Sand States of California, Nevada, Arizona, and Florida. But, this immense fact about the recent past has largely been shoved down the memory hole because our crimethink warning systems have gotten so strong that we can’t even ponder what just happened without worrying about being racist.

        • The point is that the mortgage meltdown is a perfect example of where there were and remain legal, cultural, and psychological blocks against reasoning statistically. Even among the readers of “Statistical Modeling, Causal Inference, and Social Science,” the debate quickly turns away from the data and into a question of who are the Good Guys and who are the Bad Guys. Not surprisingly, in this kind of environment where we have in effect disinvented modern statistics, things look pretty normal until they break catastrophically.

        • Anon:

          Your claim of superiority / inferiority on the basis of race meets the definition of racist. Ergo you are racist. And like most racists, you claim your racism is “scientific”.

          I just wish that besides a higher IQ you also had some cojones to actually stop beating around the bush and just say out loud: “I am a racist, and I belong to the superior race”.

          Sure, people may dislike you, compare you with Hitler etc but at least you’ll be true to your beliefs. Hell, get it printed on a t-shirt and wear it all day. But don’t play the victim. Not you, the racist. Not you, the superior race.

        • Fernando,

          “Your claim of superiority / inferiority on the basis of race meets the definition of racist. Ergo you are racist. And like most racists, you claim your racism is “scientific”.”

          That’s odd, because I didn’t make any claim of superiority or inferiority on the basis of race or say anything about it being scientific. Unless of course the simple statement “sometimes two subsets of people can have different average metrics” is considered equivalent to denying the laws of physics and a high crime against humanity for which all who utter it should be flayed unto death.

          As it happens, if you’re concerned about IQ, then I’m white, and one look at, say, the uninhibitedly meritocratic elite universities like Caltech strongly suggests that I’m not part of the superior race, since the students are mostly Asian, and a good chunk of those who aren’t are Jewish.

          In abstraction, it’s interesting to think about. When I imagine my children applying to Caltech it’s a bit more disconcerting.

          But life goes on and in neither case does thinking about it make me treat Asians, Jews or Latinos I actually know as anything other than unique self aware souls with the same spark of divinity as myself.

          So take your stupid racist slandering and shove it where the sun doesn’t shine.

        • Fernando,

          Just out of curiosity, the Spanish/Portuguese, who are a separate non-white “race” in about the same sense that Italians, Greeks, French, Germans, Welsh, and Finnish are non-white races, came to Central and South America over 500 years ago and governed themselves, or outright ruled, in one form or another almost the whole time.

          So where exactly does this Hispanic inferiority “I’m a victim, I’m a victim ! ” stuff come from?

          Who’s been oppressing all the Fernandos of the world? It wasn’t me. It wasn’t anybody I’ve ever seen. It wasn’t Genghis Khan, or Queen Elizabeth, or the Cherokee, or the Puritans, or George frikkin Bush, or. So who?

          Seriously, did the UN pass a law while no one was looking saying “Fernandos can’t study science”? Who exactly has been holding Fernando back to such an extent that you feel the need to troll this blog with your idiotic accusations about my character?

        • Wow.. from a discussion in stats to racism (and apparently some confusion between race and ethnic background given the inclusion of ‘jewish’ in the mix….*sigh*).

          I don’t know if Anon was merely pointing out that different races/ethnicities have disparate measures in areas like education, socioeconomic status, etc etc (particularly in the US), or whether Anon was implying that the origin of these measures were mostly attributable to race/ethnicity. If the latter, um…..wow….that’s quite rich. I hope that was not the intent as that notion, in my understanding, flies in the face of almost all modern social statistical research that demonstrates how social systems are effectors of these things (education, ses, social mobility, etc), and how subcultures can develop from those systems (ie the impoverished and the culture of poverty).

          Of course we can devolve and start talking things like craniometrics and reductionistic associations with intelligence a la Pieter Camper (given my nose, my facial angle is no more than 40deg at best…..meh).

        • I have the impression that “modern” racism is linked with *not* embracing uncertainty/variation. It struck me when Andrew posted the discussion about Econometrics vs. Statistics (hierarchical models) and the different cultures. A focus on (more or less … unbiased) average effect estimates while ignoring variation seems to underlie most essentialist thinking, especially the more “academic” one.

        • Daniel and Andrew:

          I mentioned it’s possible for different subsets of people to have different metrics, but on a personal level we don’t interact with “races”, or averages, or “metrics”: we interact with individual souls whose futures are not set in stone.

          That remains true whatever results from investigating those metrics: whether it’s good news, bad news, or indifferent.

          This was taken as de facto red-in-the-teeth racism since it’s a tiny deviation from the bizarre ideological belief that everyone is exactly the same.

          So who’s the one having trouble “embracing variation” here?

        • Anon:

          I think many people have difficulty embracing variation. This includes people of all political persuasions, racists and anti-racists alike.

        • I’ve heard a theory (which I don’t know the first thing about, but I’ve seen it mentioned by people who weren’t scientists) that women and men have the same average ability in mathematics, but there are more men on the very high end, and very low end, of the math-ability spectrum. By which I assume they mean: same mean, different stdev.

          So is this an example of bigots refusing to embrace variation?

        • Anon:

          I recommend thinking of this as a two-dimensional space, where one dimension is racism or bigotry or whatever, and a second dimension is willingness to embrace variation. Many thoughtful people, bigots and non-bigots alike, have difficulty handling variation. I think it’s just a difficult concept for a lot of people to understand.

        • Having reread this thread more closely, I don’t believe that you’re being accused (by Daniel, Andrew, or me) of demonstrating a problem with variation (though your comment on perceived ‘victimization’ of Latinos raised my eyebrow a bit, as did your counterpoint given CalTech admissions).

          But what this overall discussion implies is the divide between statistics, society, and policy. Statistics can do two major things in this arena: 1) group individuals, and 2) demonstrate the variation among individuals and groups. I believe these are fairly antithetical to each other, and their use errs toward #1 in today’s very reductionist (as Daniel mentioned, essentialist) manners of thinking.

          The truth is that, for example, inequalities between fairly defined boundaries such as race/ethnicity are quite complex – more complex than we give credit or care to extend our neural lipids to think about (especially when adding the time domain to changing social structures and systems). Thus, humans continue to remain fairly 1D or 2D thinkers and are inclined to over-simplification – the reliance upon averages, and little more than 2 factor correlations – talk about potential for spurious conclusions.

          Sometimes derived policy from this thinking has a positive effect on dealing w/ social inequality (infrastructure projects to counterbalance ‘white flight’, etc). Other times it can be used quite malevolently (i.e. predatory lending practices). In all cases they are often terribly over-simplistic and their impact attenuates over time.

          Modern society has few simple solutions to deal with modern day issues (wealth gaps and lack of economic throughput just to name two). I believe part of the reason for this is that blanket policies (the norm) are too facile to do what they are intended – uncertainty and variation blow them wide open as they are not constructed with these things in mind.

          As a side track in this thread:
          I think that the ACA was actually a ‘baby-step’ in the right direction, despite its flaws and its over-hurried execution (that definitely didn’t win votes). People complained how ‘long and complex’ it was in its multi-drafted form. And what was not considered were the ‘wildcard’ game players themselves (insurance companies, healthcare providers, state governments, and even businesses – all able to act rationally and ‘cheat’), which would prove to undermine confidence in the basic intents of the act. Who thought to simulate and test risk of various failures before launching? Those who would have embraced variation and uncertainty would likely have considered the potential scenarios that posed risk to such an important act and added in addenda that could have mitigated broad-sweeping risks.

          Unfortunately, this type of thinking, too, was left out of the ACA’s implementation.

          So what is the importance of statistics here? I believe being remiss in statistical thinking as well as strengthening an ethos to uphold statistical integrity (I think Hippocrates fits well here) are the largest divides that the statistical community needs to solve and garner support. Statistics underlie the complexities of the universe we live in. No getting around it.

          Uncertainty and variation are assuredly not becoming more certain and invariant as time marches on. Our thoughts and ways of being must begin to reflect and act upon that understanding.

        • This is much simpler than you’re making it. Do reasonable people have a right to notice things like “there are a lot more Asians at Caltech than blacks or Hispanics” and remain in polite society or don’t they?

          I say they do.

          People like Fernando, who I’m willing to bet a great deal of money has never had anyone say to him “hey you can’t get a Ph.D. in the hard sciences because you come from an inherently inferior race” and is really just striking a currently fashionable “I’m a victim!” pose designed to demonstrate their own superior righteousness, believe otherwise.

        • Of course anyone has the right to make an observation about anything. There are no ‘mind police’ in these woods :). Though I would ask of someone who communicates an observation what their intent is in communicating it. Is it mere naive interest? Or something more? Is the observation in their mind part of a conclusion, or a means to persuade another’s conclusions to their own, or on the other hand a means to gather others’ thoughts about the observation? It is one thing to make an observation. It is quite another to confuse one’s conclusion, or preconceptions about an observation, as merely an observation. And we are, for better or worse, all prone to the latter.

          In the CalTech example, if your statement with respect to the student body composition were of interest/curiosity only, what do you make of your observation? Have you made any conclusions yet? If the point of communicating your observation does not carry with it a preformed conclusion already, why mention the thought of discouragement in the chances of your children’s admission there?

        • It’s not “of course”. Sailer is a perfectly harmless guy who comes up with far more interesting psych and sociology hypotheses than any professional psychologist or sociologist I’ve seen. Yet he’s considered such a pariah I’m surprised Gelman lets him comment on the blog.

          This exchange was typical. He mentions the latino-mortgage connection. As far as I know it’s uncontroversial that lending standards were loosened so millions more minorities could get loans. It’s very controversial that this was the main cause of the world wide recession in 2008.

          None of that is mentioned. What’s mentioned is the same old “you’re a racist Sailer” type comments which are clearly designed to shut down the conversation.

          I was watching one of those videos on youtube the other day called “everything wrong with [some movie]”. I was watching the one for the movie “Gravity”. In it, the commentator noticed there was a Ping-Pong paddle floating in the Chinese space station. They explicitly considered this as evidence the movie makers were racist.

          There is a wide audience of people that truly considers noticing any differences between people – even harmless ones like a love of ping pong – as inherently racist.

          Nevertheless I can count. I can count how many Asians there are at truly meritocratic admission universities like Caltech. There’s a lot. I refuse to be called a racist for noticing such things.

          Whether the differences in racial makeup at Caltech have causes which are hard to change or not I don’t know. I’ve seen no evidence it’s primarily caused by things easy to change. I deeply wish that it were, since I’d use that to help my kids get in. It would also be hugely better for our society as well.

          Now I could be wrong about this. Maybe there really is some simple change which instantly gives every kid the same chance to get into Caltech. Feed every dumb kid Twinkies from age 4-7 and they turn into Einstein level geniuses as adults. If I’m wrong though, then I’m just wrong. I’m still not evil or a racist or anything.

          I refuse to accept Fernando’s dumb accusations or accept that Sailer is a modern day SS guard whose day job is to man the oven. I’m not, he’s not, and the nut cases who think noticing any differences between people is evil can kiss my freedom loving ass.

        • @Anon:

          There’s multiple things going on here. First, there’s priors. You comment with a clean slate so I’ve no reason to attribute any nefarious motives when you mention, say, Asians in Caltech.

          But when Sailer does it, it comes with a lot of prior baggage since he has a history of using such facts to draw some particular conclusions that many of us do not agree with (to put it mildly). Disagreement is ok, but many would find Sailer’s agenda outright in bad taste. There’s a trend here. A random man walking down the street with a cleaver is innocuous. A known violent psychopath not so much.

          Ironically you could turn around your data based argument here too: Maybe counting Blacks and Asians by itself does not make you racist but we’ve seen a correlation that people who are obsessed with doing certain kinds of measurement (as opposed to say measuring Alzheimers trends) are often later discovered to be racist in other independent ways and settings. In this context, if other people don’t want to join you in counting Asians in University admissions can I blame them?

          So, yeah, if you are going to draw connections, let’s draw other kinds of connections too.

        • Ok Rahul, let’s see if I can find Sailer’s alleged racism:

          I’ve seen Sailer rail against leftist hypocrisy on race when limousine liberals live in lily white suburbs and send their kids to lily white schools, but I’ve never seen Sailer advocate the kind of institutionalized segregation in the housing market that existed in (Democratic) Chicago.

          Did Sailer come out in favor of Japanese internment from WW2? How about race based preferences of any kind?

          Does Sailer favor liberal paternalistic policies like the way Australians used to rip children away from aboriginal families to be raised by whites? Sailer’s comments about the ill-advised nature of lowering lending standards to minorities can be seen as railing against such paternalistic tendencies.

          Is Sailer for or against non-meritocratic admissions policies to the ivy leagues such as allowing legacies and rich donors to send their kids there? (he’s against most likely)

          Is he in favor of the US taking over failed countries down south, in Africa or in the middle east, on the grounds that white people are superior and know what’s better for those people than they know themselves? (he’s definitely against that!)

          The fact is he disagrees with you on some things, and you can use insinuations of racism to shut down his side of the debate. So you do. Cheap shot.

        • @Anon:

          I don’t think we need to worry about shutting down the debate on the other side because even if we did gratuitously insinuate racism I rather think the sort of commentators who frequent the Sailer blog would rather take that as a compliment than a showstopper.

          Whether Steve is or isn’t racist is one debate. But he sure attracts a fan following for which being labelled racist is not much of a tag of shame or concern.

        • Got it Rahul. The only possible cause in all of existence for two groups of people to have different metrics is white male hate. You know this because of science and because people who say different spent their time lynching runaway slaves.

      • “If formal statistics had never been developed, I don’t think anyone with a straight face would make the claim that you can learn universal principles of human nature from this kind of sample and uncontrolled sample.”

        This is an interesting point, but I disagree. People have been forming crackpot psychological theories based on little to no evidence throughout all of human history. Take astrology, or the theory that body type dictates personality [1], or the thousands of other ridiculous theories purported to explain “universal principles of human nature.” I’m not saying that the psychological-science-style papers are much better, but they certainly are not worse.

        Although I agree that poor statistical methodology can appear to give shoddy pseudo-science the appearance of legitimacy, I think this is still better than the astrology-type explanations we would be stuck with in this hypothetical “statistical void,” for a very important reason: any statistical claim has a built-in mechanism for its own falsification.

        Poorly designed studies can be replicated and falsified (assuming someone cares enough about the initial claims to do all the work). Any argument against the legitimacy of astrology can be refuted since “my cousin’s friend had bad luck all his life because he was a virgo born when jupiter was in declension blah blah blah…”

        Take statistics out of the equation, and it’s true: you wouldn’t have psychological science type papers. You’d only have full-fledged pseudoscience. Auras. The zodiac. Ideas that last for hundreds or thousands of years because they are rooted in a framework wherein falsifiability is impossible.

        [1] http://www.gutenberg.org/zipcat2.php/30601/30601-8.txt

        • How about phrenology? The center for phrenological research was among left of center reformists in Edinburgh, such as the Combes family. Robert Chambers, secret author of “Vestiges,” had been an enthusiast of phrenology. But after 1840 it lost momentum and Chambers moved on to development / evolution. It’s not so much that phrenology was statistically disproved as that it failed to develop.

      • Yep. In my experience it was even “card carrying statisticians” who caused a lot of damage eg the first statisticians who were recruited by the prosecution for the Lucia case and just blindly carried out some statistical procedure they had sort of remembered from statistics 101 but did not ask any of the right questions like how did you collect this data?

  28. Pingback: Friday links: zombie ideas in evolution and psychology, a world without statistics, and more | Dynamic Ecology

  29. I have never commented on a blog post, but this is one of the dumbest things I’ve ever read… and I can’t stop myself. Inference underlies every scientific field, from biology to Quantum Physics.

    You sir, are an idiot.

    • Seth you hit on a good point. The most common inference made using statistics is “two groups are not exactly the same, therefore my theory is correct”. This is dumb, it is the worst possible example of affirming the consequent. If the method of making inferences (something underlying every scientific field) is faulty… I would expect the value of the research to gradually decay to zero or negative (eg Medical treatment is now the third leading cause of death in the US).

      One day I will get around to studying the spread of strawman NHST from educational research to various fields and compare to what practical accomplishments have resulted. I will be very surprised if the relationship doesn’t “hit between the eyes”.

  30. Pingback: Shared Stories from This Week: Jul 25, 2014

  31. I’m not sure this is the proper way to evaluate the importance of a field. Andrew’s post seems to be focusing on the question “how many groundbreaking discoveries wouldn’t have happened without statistics.”

    What if we evaluated the importance of the steam engine, or the printing press by these criteria? Any individual discovery throughout modern history *could* have been accomplished without the printing press, so can we conclude that it wasn’t an important invention, or that the world would not be different without it? We could easily imagine how quantum physics, gravity, cochlear implants (and even Clippy!) *could* easily have been discovered/invented without the printing press. But the importance of the printing press, the steam engine, the agricultural revolution, the computer etc. lies primarily in the fact that they increase the efficiency and pace of development, not the specific discoveries/inventions that would have been impossible without them.

    I’m not at all trying to claim that statistics ranks up there with the printing press, but I am saying that the nature of the benefits from statistics is similar. The relatively recent development of statistics has vastly increased the efficiency by which conclusions can be drawn from observations. Sure, any individual conclusion probably would have been possible without statistics, but that seems to miss the fundamental value of the field.

    • If you take Stephen Stigler’s view that the age of “Statistical Enlightenment” begins with Galton’s 1885 speech, then an awful lot of the modern world already existed before Statistical Enlightenment existed.

    • “What if we evaluated the importance of the steam engine, or the printing press by these criteria?”

      The book I just finished, Secord’s “Victorian Sensation,” is about the anonymous 1844 publication of an anti-Creationist history of the universe called “Vestiges of the Natural History of Creation.” This influential forerunner of “The Origin of Species” was written secretly by Robert Chambers, a publisher who owned ten steam-powered printing presses, and much of the book is about the impact of the steam engine upon printing and reading in Britain.

      Practically every intellectual in Britain read the book, and much of the commentary upon it that Secord collects is quite modern. But it seems strange to me that all this preceded by about a generation the articulation of a concept as seemingly fundamental as “regression to the mean.” Interestingly, Chambers himself cited Quetelet’s statistical work in “Vestiges,” but he appears in this, as in so much else, to have been out ahead of most of his contemporaries. (Chambers has been occluded by the vast fame of Darwin, and that’s not unjust — much to Darwin’s relief when reading “Vestiges” in 1845, Chambers didn’t come up with natural selection as the mechanism of his Law of Development — but Chambers was a very bright, sensible guy.)

      I could imagine an alternative history of the world in which statistics emerged much earlier. But it seems like the brightest minds from at least the time of Plato onward tended to gravitate toward, say, geometry, astronomy, and physics rather than statistics and the human sciences. I suspect that this had something to do with status and with aspirations. In a messy world, the geometry, astronomy, and physics seemed like the cleanest, highest things to think about. In Raphael’s “The School of Athens,” Plato is pointing up while Aristotle is pointing out, but really Aristotle should be pointing at a 45 degree angle.

      For thousands of years, it seemed only fitting that the highest intellects were attracted to the highest, most heavenly, most abstract subjects.

      Thus, modern statistics tended to evolve out of astronomy and the need to deal with multiple readings from observations, in part because the smartest guys (e.g., Gauss) were drawn to astronomy and physics. The astronomer Quetelet made the leap to the human sciences in the first half of the 19th Century by applying the normal distribution used in the observatory to the coat sizes of Scottish soldiers.

      There’s something a little declasse about statistics. So when a rich gentleman like Galton turned to the subject in the late 19th Century, he had an almost wide open field in which to run amok.

      • Well, I don’t agree that “…the smartest guys…were drawn to astronomy and physics…”, and in fact am pretty sure such a statement is a gross oversimplification. In fact, if anything the opposite would be true, because it’s much more difficult to come up with reliable cause and effect explanations in complex systems (e.g. biology, economics) than it is in the simple systems that are common in physics, especially the physics they studied from the 17th through 19th centuries. Determining fundamentals of gravity pales in comparison to what Darwin worked out, or even Mendel, IMO.

        A strong case could be made that it was in fact Aristotle that first articulated the concept of natural selection, if quotes attributed to him are in fact things that he wrote. More specifically:

        • Riiiiiiiiiiiiight.

          Pick up Newton’s Principia and Darwin’s the Origin of Species and read the first 100 pages of each. Then report back on how much harder evolution was then gravity.

        • “than” would be the word you’re looking for there.

          And note, since you’ve missed it, that the issue is whether or not it’s easier to correctly identify cause and effect in a simple system than a highly complex one. If you think gravity is more complex than the mechanism(s) generating biodiversity through time, probably not much point in talking to you.

        • You missed the point. Whether a system is complex or simple, and whether it’s easy to identify cause and effect, is not an absolute.

          Things we understand seem simple, things we don’t seem complex. Physics didn’t start out with simple problems, it started out with incredibly hard ones and then made them simple.

          If you don’t know enough physics, or the history of physics, to know this is true then there is no point in talking to you.

          And seriously, every physicist I’ve ever met could read Darwin’s book and understand it. I don’t think I’ve met a biologist who could read Newton’s work and understand it. It’s just insane to claim the latter was easier than the former.

        • “Physics didn’t start out with simple problems, it started out with incredibly hard ones and then made them simple.”

          Really? What an incredibly stupid way to proceed. Most scientific disciplines I’m familiar with start with the easier problems first, and then tackle increasingly more difficult ones subsequently. I could have sworn work on gravity and light came before that on nuclear physics, quantum mechanics etc.

          I think I’m starting to see where the arrogance and simple mindedness of a lot of physical scientists originates.

        • Jim,

          I believe your background in physics is so weak you don’t realize how absurd you’re being.

          I will mention just one example. Physical waves, like you see at the beach, look incredibly chaotic, and seemed helplessly unintelligible to the ancients (unlike say the art of animal breeding!). Nowadays a physicist looking at them will invoke Fourier Analysis from math, and conservation of energy from physics to see the hidden order in those waves.

          However simple those ideas may seem today, historically they weren’t. There was about a 200 year gap, full of incredible effort by many of the most celebrated scientists who ever lived, to go from Newton to a deep and general understanding of Conservation of Energy.

          In the classroom the jump from Newton’s laws to conservation of energy takes a matter of minutes. If it seems easy to you it’s only because physicists made it easy. Not because Conservation of Energy was an easy problem.

          Similarly with Fourier Analysis. Today a high school student who studies the quadratic theorem can progress to Fourier Analysis in half a decade. Historically, though, it took about 4000 years to go from the quadratic theorem to Fourier Analysis.

          If Fourier Analysis seems simple to you today, it’s because mathematicians made it simple.

          So I’ll just leave it at this: actually try to read any 50 pages of Newton’s Principia and then explain why what you’re reading is soooooooooooo much easier than Darwin’s Origin of Species. At least look at the damned book long enough to realize how ignorant you’re being.

        • yeah, except one problem: I wasn’t ragging on biologists or their work. I love biology and wouldn’t wish it to be another physics.

          The point wasn’t that biology or biologists were bad in any way whatsoever (and it definitely wasn’t that physicists could do it better!). The point is that evolution really is a much easier advance than something like Classical mechanics, but came 200 years later. There’s no way anyone can look at Newton’s Principia and Darwin’s Origins and doubt that for a second.

          There’s no straightforward reason for it that I can see.

        • In all seriousness, math and physics have always received the lion’s share of the extreme high end of the IQ spectrum. That’s true even if the tops of the fields (biology, chemistry, physics, medicine, economics) are very similar IQ wise (which has been my personal experience).

          If anything Sailer is understating the point. Classical Mechanics was vastly harder (many orders of magnitude at least) to figure out than evolution or regression. If Mechanics seems like an ordered subject to anyone today that’s only because physicists did an incredible job using deep insights to simplify things; and not because they initially appeared simple in any way whatsoever.

          So why was Classical Mechanics figured out so much earlier than things like Regression?

        • When T.H. Huxley read Darwin’s “The Origin of Species,” he said, “How stupid of me not to have thought of that!”

          Did any of Newton’s colleagues have a similar reaction to the “Principia?”

        • No. There was a serious buildup leading up to Newton. The first recognizably “mathematical physics” on dynamics was done at Oxford in the 1200’s. So there was about a 400 year build up to Newton (including obvious prerequisites like Kepler and Tycho et al).

          But still, it marks a definite non-inevitable break from what came before.

        • Compare Darwin and Galton to their younger contemporary physicist James Clerk Maxwell, who published A Dynamical Theory of the Electromagnetic Field in 1865, halfway in time between “Origin of Species” and “Descent of Man.” Compare the level of mathematical sophistication in Maxwell’s work to the level in Galton’s 1865 book “Hereditary Genius.” Most of Galton’s statistical breakthroughs came after 1865 as an attempt to shore up the ideas he brought up in that book but could only treat with math that was somewhere between obtuse and hand-waving.

        • Sure, Maxwell’s E&M was (and still is) more mathematically involved than statistics. But there’s a technical point worth considering. Before cheap and easy computation power existed, anyone doing work had to rely much more heavily on analytical results from mathematics.

          Clean mathematics can compensate to some extent for the lack of number crunching.

          Statistics in particular relies on two developments: the mathematics of exponentials (and not just because of the normal distribution: http://en.wikipedia.org/wiki/Exponential_family) and matrices.

          Abstract and Linear Algebra (matrices) were rapidly advancing throughout the 1800s.

      • A different aspect of all this is that even though we think today of physics as a supremely practical subject, that’s not how it was viewed in Newton’s time.

        Physics back then was highfalutin philosophy coming out of the same milieu as, and something akin to, those arguments over how many angels could dance on the head of a needle. Practical engineers like Leonardo Da Vinci didn’t go to university like Newton and didn’t study the kinds of things Newton did (although in Da Vinci’s case he tried to study some of it himself but almost entirely failed).

        It may be that mathematical physics needed a far more philosophical/scholastic environment to germinate, whereas something like evolution or statistics needed a more down-to-earth empirical/practical field to grow on.

      • The area where we are most likely to agree is that each early scientific effort attempted to provide a logic to ‘answers’ that was then more practical, or at least palatable: yes/no, true/false. The logic of uncertainty was far less a practical matter than it was a ‘mystery of the ether’ so to speak.

        Given enough time and a few smelling-salts starting in the late 1500s with Simon Stevin and Tycho Brahe to the mid 1600s with Pascal/Fermat/Graunt/Petty, the need to understand uncertainty began to take shape. Still, it took around 200 years for a larger disciplinary focus that began modern stats to catch on.

        If one considers the rate of mathematical vs statistical evolution, the latter is by comparison quite the junior. Society still is mostly inclined to ingest things in the universe as ‘certain’: true/false or yes/no. Interestingly some of that still remains in various (generally poor) understandings of statistics, which really hasn’t (or at least shouldn’t have) an axe to grind in certainty.

  32. Here’s a possibility for the weird lag in statistics versus physics: geometry was pretty sophisticated back in Ancient Greek times, but numbering systems in Europe were not. Doing simple things like calculating averages using Roman numerals wasn’t much fun. There’s a famous quote from the Middle Ages in which a merchant inquires about where to send his son to college, and is told that any respectable college in Germany can teach him to add and subtract but if you want him to master multiplying and dividing, you’ll have to send him to Italy for his higher education.

    Would simple baseball statistics like batting average and ERA have become popular if we still used Roman numerals?

    Hence, Newton used his revolutionary (and secret) calculus to work out his physics discoveries, but published his book using only geometric proofs. The use of geometric proofs meant that the first 100 or so people in Europe who read Newton’s Principia almost unanimously agreed that he was right, so no revolutionary work of science ever was instantly greeted more rapturously. The best educated Europeans in 1687 were very good at Euclidean geometry, as they had been for most of the last 2000 years. But Europeans had only been using modern Arabic numbers for maybe a tenth or an eighth as long as they’d been using Euclidean geometry.

    • Yeah, but while the Greeks were the giants of geometry, the Hindus were the giants of arithmetic/algebra. The latter was transmitted to Europe, via the middle east, absorbed and improved a long long time before Galton.

    • To give some dates here:

      Europeans solved the cubic and quartic equations by the mid 1500’s. About the same time Cardano wrote his manuscript on games of chance.

      That’s a long time before Galton.

  33. So, I have kind of a Sapir-Whorf theory of statistics — Europeans didn’t have a very good system for doing simple things with numbers like taking averages until about a half-millennium ago. In contrast, geometry was an extremely prestigious part of higher education in Europe in the Middle Ages.

    It would be interesting to see if there was another culture that had a more useful numbering system than Roman numerals that developed more interest in statistics earlier. I’m wondering whether Japan might be like that? I don’t know what kind of numbering system the Japanese used, but they were calculating sports statistics (for sumo wrestling) by the later 17th century.

    • My thinking is going in a very different direction. If the number system was good enough for Newton’s teacher to prove the fundamental theorem of calculus and check it with examples, then it’s plenty good enough to take averages or standard deviations.

    • Part (and only a part) of it might be that early problems in mechanics (statics and the balancing of forces for building all those aqueducts and coliseums) got mathematized early.

      Initially, the burden of having to use real math would be a hindrance, but over the long run it allows for a very long chain of insights all building on each other over centuries.

      There are hundreds of deep mathematical and physical insights needed to develop mechanics to the point at which Lagrange/Hamilton left it.

      There are perhaps a dozen or so for evolution.

      The latter should have been easier, but because those insights were qualitative in nature, they required great leaps of imaginative intuition, whereas the former could build up over long chains of increasingly sophisticated reasoning over centuries.

      I could imagine an alternate history where if the greeks had of worked on just the right seed problem (even if superficially unrelated to evolution and statistics) and developed the mathematics to answer it, it would have snowballed over time and led to something like evolution much much sooner.

      • “if the greeks had of worked on just the right seed problem”

        The importance of plant and animal breeding in inspiring the Darwin-Galton-Fisher line is hard to overstate.

        It doesn’t seem impossible that the Greeks could have gotten into scientific agriculture, but nobody seems to have done much with it before the 18th Century. Very strange …

        • The Greeks were serious about animal breeding.

          The Greeks probably did scientific agriculture. Almost all Greek science is lost. All we have left is what the Romans understood; in the case of agriculture, what Varro understood. But he didn’t understand much. He says that he read 50 books of Greek agriculture and it was mostly “philosophy” so he threw it out.

        • Steve,

          This was John Maynard Keynes’s take (footnote 3 page 91 “A Treatise on Probability”)

          “The continuity and oneness of modern European thought may be illustrated, if such things amuse the reader, by the reflection that Condorcet derived from Bernoulli, that Godwin was inspired by Condorcet, that Malthus was stimulated by Godwin’s folly into stating his famous doctrine, and that from the reading of Malthus on “Population” Darwin received his earliest impulse.”

          There you go: Jacob “large numbers” Bernoulli -> Darwin.

  34. Without statistics, there would be no Internet search. Do you remember the world before Google? The world before AltaVista?

    Without statistics, there would be no quantum chemistry, and therefore almost none of the modern drugs.

    Without statistics, there would be no discoveries of exoplanets. Or any other difficult discoveries in astronomy.

    There are so many more examples than just those mentioned in the article…

    • Anon:

      Is that really the case regarding Google etc? You might be right, but I imagine that a pretty good version of Google could be done using linear algebra with no statistics. Quantum chemistry I don’t know about, and as for exoplanets, that’s all fine but it’s not changing our lives.

      • Most anything ‘quantum’ in physics is filled with stat principles just by virtue of its uncertain nature. Statistical physics has been slowly emerging as a subdiscipline (including stochastic processes (ergodic and non) and linear-nonlinear dynamical systems). Lots of influences in other areas of physics for this emergence.

        As far as Google goes, yeah, I agree you can pull off a search engine using purely numeric/matrix methods. This is generally what goes on under the hood of statistical packages anyway – one really isn’t left with a choice from a computational perspective (if you ever wish to see output within your lifetime). I do wonder, however, how differently page ranking would perform under pure matrix methods versus multivariate statistical models.
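To make the "search ranking as linear algebra" point in this thread concrete, here is a minimal sketch of link-based ranking by power iteration. It is a toy in the spirit of PageRank, not a description of how any real search engine works; the function name `pagerank`, the damping value 0.85, and the four-page `toy_web` graph are all assumptions made up for illustration.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Rank pages by repeatedly multiplying a rank vector by the link matrix.

    `links` maps each page to the list of pages it links to
    (a hypothetical toy format, chosen for this sketch).
    """
    pages = sorted(links)
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}  # start from a uniform vector

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page, outgoing in links.items():
            if outgoing:
                # A page splits its current rank equally among its links.
                share = damping * rank[page] / len(outgoing)
                for target in outgoing:
                    new_rank[target] += share
            else:
                # A page with no links spreads its rank over all pages.
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


# Hypothetical four-page web: "c" is linked to by every other page.
toy_web = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(toy_web)
for page in sorted(ranks, key=ranks.get, reverse=True):
    print(page, round(ranks[page], 4))
```

The computation is nothing more than a repeated matrix-vector product, although the usual justification for the damping term (a "random surfer" who occasionally jumps to an arbitrary page) is itself a probabilistic story.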

  35. Daniel Kahneman said in Thinking, Fast and Slow: “I have yet to meet a successful scientist who lacks the ability to exaggerate the importance of what he or she is doing.” Andrew, based on your post, all I can say is, I guess he hasn’t met you! Extremely impressed by your ability to objectively reflect on the importance of your field. That’s quite an accomplishment.

    • Alex:

      I am no counterexample, as I have the ability to do this exaggeration—just read some of my grant proposals sometime! (They’re public information, at least I think the funded ones are.) But there is something refreshing about blogging here, as I can pretty much directly write what I think.

      • Do you really think you might have a Gremlin? I had in mind like a 1930s car full of gremlins. Modern SPC is pretty important not only to scaling up production, but to building products with a lot of parts requiring precise tolerance. It’s hard to imagine WWII production without Shewhart’s methods on the shop floors. They weren’t as sophisticated as modern SPC, but were pretty mainstream.

        On a lighter note, beer would not be so good and likely much more expensive because Gosset would have never introduced statistical testing to the Guinness brewery.

      • Andrew,

        I apologize for not thinking a bit more before responding. I realized that I meant to say that SPC is much more than quality control. SPC is literally about designing and running the process, predicting failure before it happens, keeping the line running, improving throughput and so forth. Understanding of variability is built in pretty thoroughly and it is hard to imagine a replacement. It might make a good sci-fi story if one has sufficient imagination.

        In my world of software development, I have seen a large body of thought that eschews most statistics, thinking the underlying process not just variable. My observation is that they hit a wall. This is a much longer discussion, but they achieve a performance level that solves some problems, but they don’t get to the next level. I am reminded of the way Moneyball keeps repeating the lessons of small samples and the Deming Funnel experiment. Then I consider what the upper bounds for improvements in crop yields, medicine, microchip manufacture, and so forth might be absent statistical reasoning about process change. Engineering of the process is linked to engineering of the products. Perhaps there is an alternative way to break through that barrier, but understanding and managing the underlying variation seems more or less essential to much of engineering.
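For readers who have not seen the Shewhart-style process control discussed in this thread, the basic idea can be sketched in a few lines. This is a simplified individuals chart under stated assumptions: limits sit three estimated standard deviations from the baseline mean, and the standard deviation is estimated from the average moving range divided by 1.128 (the standard constant for ranges of two consecutive points). The function names and the measurements are invented for illustration.

```python
from statistics import mean


def control_limits(baseline):
    """Return (lower, center, upper) limits from in-control baseline data."""
    center = mean(baseline)
    # Moving range: absolute difference between consecutive measurements.
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    # 1.128 converts the average range of pairs into a sigma estimate.
    sigma_hat = mean(moving_ranges) / 1.128
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat


def out_of_control(measurements, baseline):
    """Indices of measurements that fall outside the baseline's limits."""
    lower, _, upper = control_limits(baseline)
    return [i for i, x in enumerate(measurements) if x < lower or x > upper]


# Hypothetical part dimensions (say, millimetres) from a stable process.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
new_parts = [10.0, 10.1, 11.2, 9.9]
flagged = out_of_control(new_parts, baseline)
print(flagged)
```

The point of the limits is to separate routine variation, which should be left alone, from a signal that the process itself has changed. Real SPC practice adds further run rules and subgrouping that this sketch leaves out.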

  36. Quantum mechanics depends on probability, but that is not the same thing as statistics in a Fisher or Neyman-Pearson sense. Likewise, statistical mechanics is more probability than the statistics we normally recognize. If you throw out probability and quantum mechanics, relativity becomes a lot less relevant, more of a curiosity. On the other hand, empirical physics would have hit a wall long ago if we couldn’t characterize the uncertainty in measurement.

    Newtonian mechanics works pretty well, except that real systems are more or less chaotic. Laplace’s work with statistics made more precise calculations (rightly ignored by Newton to develop the abstract rules) possible. Anyway, imagining a near alternative in which humans have some sort of cognitive statistical circuit breaker seems pretty hard.

  38. I remember hating to calculate experimental errors in physics. Going out on a limb, Newton probably didn’t need to worry much about statistics because:

    1) He was able to minimize it in his experiments
    2) The bulk of his work was in fields with relatively few random variables
    3) He repeated his experiments multiple times (more times than strictly needed?)
    4) He had a good idea of how relevant the experimental errors were to the derived equation (calculus can tell you that, even without calling it statistics)

    Statistics are useful for handling any mathematical situation that isn’t readily black & white–which happens to be most things.

    • I think the simple answer is that Newton didn’t need to worry much about the precision and error in his work. For example, the elegance of his orbital mechanics trumped everything else and derived Kepler’s equations. The errors were largely systematic, for example gravitational perturbations, rather than random. These got worked out over the next couple centuries. What was it George Box said? “All models are wrong, some are useful.”

      Newton did a lot more (sound propagation, light refraction, spectral decomposition, laws of motion, the reflecting telescope), but looking back, his experiments look more like inspired demonstrations than precise and highly skilled experimental work. His conceptualization of the underlying physics into mathematical models was the kind of genius we see only once every couple centuries.
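The remark in comment 38 that calculus can tell you how much an experimental error matters to a derived quantity can be illustrated with first-order error propagation. The sketch below estimates each partial derivative numerically and combines the contributions in quadrature, which assumes the measurement errors are small and independent. The pendulum formula g = 4π²L/T² is standard physics; the helper names and the measured values are hypothetical.

```python
import math


def propagate_error(f, values, errors, step=1e-6):
    """First-order uncertainty of f(*values), given an error for each input.

    Assumes small, independent errors, so contributions add in quadrature.
    """
    variance = 0.0
    for i, (value, error) in enumerate(zip(values, errors)):
        h = step * max(abs(value), 1.0)
        up = list(values)
        down = list(values)
        up[i] = value + h
        down[i] = value - h
        # Central-difference estimate of the partial derivative df/dx_i.
        derivative = (f(*up) - f(*down)) / (2 * h)
        variance += (derivative * error) ** 2
    return math.sqrt(variance)


def g_from_pendulum(length, period):
    """Gravitational acceleration from a simple pendulum's length and period."""
    return 4 * math.pi ** 2 * length / period ** 2


# Hypothetical measurements: length in metres, period in seconds.
length, period = 1.0, 2.006
length_err, period_err = 0.001, 0.005

g = g_from_pendulum(length, period)
g_err = propagate_error(g_from_pendulum, [length, period], [length_err, period_err])
print(f"g = {g:.3f} +/- {g_err:.3f} m/s^2")
```

Because the period enters squared, its relative error counts double, so in this example the timing uncertainty dominates the length uncertainty. That sensitivity falls out of the derivatives alone; only the quadrature step leans on a statistical assumption.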

  39. A world without statistics! Oh what a nightmare! Economists are as useful to nation building as scientists. New drugs for both animals and humans are tested over and over again using statistical methods before being approved by the FDA. Data generated from these tests are analyzed using statistical methods, RIGHT?? New equipment of all sorts at factory level (fighter jets, automobiles, computers, etc.) is tested before large-scale assembly-line production for the public is commissioned. Aren’t the data generated and analyzed using statistical methods?? All testing of manufactured products involves some form of statistical methods.

    The US Congress often makes decisions based on evidence presented by Economists who cannot convince anyone with anecdotal evidence but with conclusions derived from Econometric models or Mathematical Programming models or variants of both. Central banks in the EU and the Federal Reserve in the US base their decisions on research findings using one form or the other of Statistics based methods (Econometric methods whether Time Series, Panel Data analysis or Cross Section data analysis).

    Barnabas Kiiza
    Makerere University

  41. I know this article is a few years old now, but I came across a website and book that push something called “observation oriented modeling” as an alternative to inferential statistics itself (including NHST and Bayesian philosophy). The website is [here](http://www.idiogrid.com/OOM/). Frankly, I am pretty skeptical. I figured I’d comment here since it relates to imagining a world without statistics.
