The most-cited statistics papers ever

Robert Grant has a list. I’ll just give the ones with more than 10,000 Google Scholar cites:

Cox (1972) Regression models and life-tables: 35,512 citations.

Dempster, Laird, Rubin (1977) Maximum likelihood from incomplete data via the EM algorithm: 34,988

Bland & Altman (1986) Statistical methods for assessing agreement between two methods of clinical measurement: 27,181

Geman & Geman (1984) Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images: 15,106

We can find some more via searching Google scholar for familiar names and topics; thus:

Metropolis et al. (1953) Equation of state calculations by fast computing machines: 26,000

Benjamini and Hochberg (1995) Controlling the false discovery rate: a practical and powerful approach to multiple testing: 21,000

White (1980) A heteroskedasticity-consistent covariance matrix estimator and a direct test for heteroskedasticity: 18,000

Heckman (1977) Sample selection bias as a specification error: 17,000

Dickey and Fuller (1979) Distribution of the estimators for autoregressive time series with a unit root: 14,000

Cortes and Vapnik (1995) Support-vector networks: 13,000

Akaike (1973) Information theory and an extension of the maximum likelihood principle: 13,000

Liang and Zeger (1986) Longitudinal data analysis using generalized linear models: 11,000

Breiman (2001) Random forests: 11,000

Breiman (1996) Bagging predictors: 11,000

Newey and West (1986) A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix: 11,000

Rosenbaum and Rubin (1983) The central role of the propensity score in observational studies for causal effects: 10,000

Granger (1969) Investigating causal relations by econometric models and cross-spectral methods: 10,000

Hausman (1978) Specification tests in econometrics: 10,000

And, the two winners, I’m sorry to say:

Baron and Kenny (1986) The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations: 42,000

Zadeh (1965) Fuzzy sets: 45,000


But I’m guessing there are some biggies I’m missing. I say this because Grant’s original list included one paper, by Bland and Altman, with over 27,000 cites, that I’d never heard of!

P.S. I agree with Grant that using Google Scholar favors newer papers. For example, Cooley and Tukey (1965), “An algorithm for the machine calculation of complex Fourier series,” does not make the list, amazingly enough, with only 9300 cites. And the hugely influential book by Snedecor and Cochran has very few cites, I guess cos nobody cites it anymore. And, of course, the most influential researchers such as Laplace, Gauss, Fisher, Neyman, Pearson, etc., don’t make the cut. If Pearson got a cite for every chi-squared test, Neyman for every rejection region, Fisher for every maximum-likelihood estimate, etc., their citations would run into the mid to high zillions each.

P.P.S. I wrote this post a few months ago so all the citations have gone up. For example, the fuzzy sets paper is now listed at 49,000, and Zadeh has a second paper, “Outline of a new approach to the analysis of complex systems and decision processes,” with 16,000 cites. He puts us all to shame. On the upside, Efron’s 1979 paper, “Bootstrap methods: another look at the jackknife,” has just pulled itself over the 10,000 cites mark. That’s good. Also, I just checked and Tibshirani’s paper on lasso is at 9873, so in the not too distant future it will make the list too.

69 thoughts on “The most-cited statistics papers ever”

    • Jeff:

      This example illustrates how certain fields such as biology have so many cites. I recall learning a few years ago that low-ranking bio journals have higher impact factors than high-ranking statistics journals. There’s something just so wrong about an application to the bootstrap having so many more citations than the original bootstrap paper!

      And, looking down on the same page as that paper on Google scholar, I find this one:

      12,000 citations: MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods
      K Tamura, D Peterson, N Peterson, G Stecher… – Molecular biology and …, 2011 – SMBE

      and this:

      9000 citations: A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood
      S Guindon, O Gascuel – Systematic biology, 2003 –

      I just refuse to count these as among the most-cited statistics papers!

      • Andrew –

        Surprise surprise we disagree! :-)

        I think that the process of focusing on a specific application can be just as important as writing down the most general case. It is a disservice to those fields and to our field to discount them. As a case in point, that phylogenies paper actually inspired even more general statistical work on the “problem of regions”

        If you don’t count those as most-cited statistics papers, you should remove all citations to the above papers that came from someone’s software that implemented those methods. How many citations would GEE/Kaplan-Meier/Cox regression/FDR/ARCH/Bagging/Boosting have without the software that allowed users to implement and use them?


        • Jeff:

          I don’t disagree that these papers are useful, I just think you have to draw the line somewhere.

          Here’s how I see it: if someone develops new statistical theory or methods, then that’s a statistics paper, and I count as citations all the other statistics papers that cite it, and also all the applied papers that cite it. But if someone takes an existing statistical method and ports it to another field, I don’t consider it eligible for the “most cited statistics papers.”

          The econometrics papers on the list, by the way, I do not consider as porting of existing statistical methods. On the contrary, those econometrics papers are statistics research that happen to appear in non-statistics journals.

        • “But if someone takes an existing statistical method and ports it to another field, I don’t consider it eligible.” Doesn’t that mean that if you looked hard enough, none of them would be eligible?

          Speaking of psychology:

          * 16,384 cites: Campbell, D. T., Stanley, J. C., & Gage, N. L. (1963). Experimental and quasi-experimental designs for research (pp. 171-246). Boston: Houghton Mifflin.

          * 12,644 cites: Campbell, D. T., & Fiske, D. W. (1959). Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological bulletin, 56, 81-105.

        • More:

          13,553 cites: Cohen, J. (1992). A power primer. Psychological bulletin.

          10,430 cites: Shrout, P. E., & Fleiss, J. L. (1979). Intraclass correlations: uses in assessing rater reliability. Psychological bulletin.

      • Continuing along this theme, I google-scholar’d *maximum likelihood* and this appeared:

        11,000 citations: Refinement of macromolecular structures by the maximum-likelihood method
        GN Murshudov, AA Vagin, EJ Dodson – Acta Crystallographica Section D: …, 1997 –

        And a search on *Bayes* yields:

        13,000 citations: MRBAYES: Bayesian inference of phylogenetic trees
        JP Huelsenbeck, F Ronquist – Bioinformatics, 2001 –

        and another 14,000 for MrBayes 3: Bayesian phylogenetic inference under mixed models
        F Ronquist, JP Huelsenbeck – Bioinformatics, 2003 – Oxford Univ Press

        You get the picture.

        But then I found this one, which I’d never heard of and has an amazing 24,000 citations:

        A new look at the statistical model identification
        H Akaike – Automatic Control, IEEE Transactions on, 1974 –

      • What’s wrong is paying too much attention to citation counts. If I was famous & published some egregiously crappy heresy I bet I’d get cited a lot too. For the wrong reasons, of course.

        How many cites did Daryl Bem get for his paper?

  1. Bland/Altman is the paper most beloved by medical researchers; the plot of a-b against a+b is something like an unnormalized residual plot, and is supposedly called a Tukey plot elsewhere (I never saw this, however).

    When medical researchers review statistics (they do it a lot, I fear), they check for the presence of the following “always correct” terms:

    a) p-values
    b) t-test
    c) p-values
    d) wilcoxon
    e) p-values
    f) Bland-Altman plots
    g) p-values
    h) Semi-professionals: Area under curve (AUC)
    i) p-values

    Everything else should be returned, because it’s modernistic stuff nobody understands. If you are lucky, the journal has a statistical reviewer.

    • Bland and Altman are slightly chagrined that their contribution is considered to be the Bland-Altman plot. Bland says “we never claimed priority for plotting difference versus mean and that adding the limits of agreement was our contribution”.

      • Jeremy, I fully agree. Both Bland and Altman have published the most lucid papers on statistics that can be understood by mere mortals. And Altman is an excellent reviewer.
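
For readers who have not met the method discussed in this thread, here is a minimal sketch of the limits of agreement, assuming the usual definition (mean difference plus or minus 1.96 standard deviations of the paired differences) and plotting the difference against the mean (a+b)/2 rather than the raw sum. The function names and the sample measurements are hypothetical, chosen only for illustration.

```python
import statistics


def limits_of_agreement(a, b, z=1.96):
    """Return (bias, lower, upper) for paired measurements a and b.

    bias is the mean of the differences a - b; the limits are
    bias -/+ z times the sample standard deviation of the differences.
    """
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")
    diffs = [x - y for x, y in zip(a, b)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    return bias, bias - z * sd, bias + z * sd


def mean_difference_points(a, b):
    """Return the (mean, difference) pairs that would be plotted."""
    return [((x + y) / 2, x - y) for x, y in zip(a, b)]


# Hypothetical readings from two measurement methods on four subjects.
method_a = [10, 12, 14, 16]
method_b = [9, 13, 13, 17]

bias, lower, upper = limits_of_agreement(method_a, method_b)
points = mean_difference_points(method_a, method_b)
```

The plot itself is just a scatter of `points` with horizontal lines at `bias`, `lower`, and `upper`.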

  2. What we really need is a cumulative citation index. Take the bootstrap as an example. Lots of people use the bootstrap but have never read the original article. There’s a good chance they read one of the multitude of “Using the bootstrap in the field of X.” The best ideas experience this second wave of dissemination, papers that don’t really add anything new, but recast and reword the original work to appeal to a broader audience.

    On Baron and Kenny, I suspect at least 1/2 of the citations are refuting the work. Any time one prominent article comes out applying mediation analysis, there are at least 5 papers written to refute the original BK approach.

    • What I think we need with some urgency is a (possibly crowd-sourced) database of papers indexed by the statistical methods used. That way not only could you trace what people learnt (or didn’t) over time about how to do Method X, you’d also be able to find nice and nasty examples of writing it up.

    • Especially the works about interactions/mediation etc. seem to be cited very often indeed. Is there a special reason for that? And why on earth are so many publications wrong if the basic articles are cited so often? Well, I suppose because nobody is actually reading them…

      • If you see Baron and Kenny cited nowadays, it tends to be to say “The Baron and Kenny approach to mediation is known to be underpowered and so we used …”, but if you don’t mention Baron and Kenny, reviewers ask why not.

        • Actually my methods professor (in sociology) cited them and used their approach in addition to the Sobel test… thus I also used them in, I think, two assignments, maybe even in my Bachelor thesis; I don’t remember exactly (only that I also mentioned something about Hayes and Preacher and bootstrapping being a better alternative…). Anyway, that’s not so much my point. There are obviously some influential papers about interaction and mediation which are cited very often, but on the other hand a really huge number of papers in sociology make basic mistakes in interpreting interactions, like interpreting the coefficient of X or Z as “main effects” even if X*Z is part of the model and neither X nor Z could equal zero. It’s probably at least as bad as shown by Brambor et al. (2006) for the political sciences…

  3. This weekend I learned how these citation indices have exploded over the last ten years. (Put in various high profile individuals – or even average people – and see how citations have accelerated exponentially.) This is all good and understandable in terms of the internet and the rise of statistics etc etc. But is it also a sign of publication inflation? Publications have a kind of currency value in academia. The trend feels bubble like – although as has been discussed here before publications don’t really have a historical metric or intrinsic value to be constrained. Has the value of scientific output expanded by a factor of 5 to 10 over the last decade? To me these citation trends are simultaneously exciting and worrying.

    • There is a quote out there that goes something like: “Soon the journal articles will be being published faster than the speed of light, fortunately this does not violate the laws of physics since they convey no usable information.” I couldn’t find it right now, but remember the original being worded better.

  4. I tabulated the journals the papers at the top (excluding those in the comments as I’d got bored by then) appeared in, and it’s interesting how few appeared in the ‘top’ statistics methodology journals: JRSSB got 3, JASA 1, Biometrika 2, Ann Stat 0. In contrast, Econometrica has 5, Machine Learning 3, and even the Lancet 1 (Bland and Altman). And yet I know of statistics departments where they’re only looking for papers in the ‘top’ methodology journals when considering prospective faculty, and anything else is disregarded.

  5. Basic Local Alignment Search Tool
    SF Altschul, W Gish, W Miller, EW Myers, DJ Lipman, Journal of Molecular Biology, 1990

    49,949 citations according to Google Scholar. Maybe not a pure statistics paper, but it established the importance of melding statistics and algorithms to make big data analysis feasible. And more highly cited than any other paper in this discussion.

    • I think this puts older papers at a significant disadvantage.
      Maybe applying some kind of weighting taking into account the number of papers published by the journals citing the paper?

  6. How about the paper “Statistical aspects of the analysis of data from retrospective studies of disease” (Mantel & Haenszel, 1959) with 10255 citations on Google Scholar? A rather influential paper for epidemiologists and other (bio-)medical researchers.

    I’ll hazard a note that the paper by Bland and Altman is more or less a port or a slight refinement of Tukey’s mean-difference plot to the medical field. In addition, Eksborg (1981) proposed a similar method based on Deming regression.

    However, such often seems to be the case. A method/software is taken into use in some new field, and it starts acquiring citations fast. There are several other examples of such papers on this list, such as Felsenstein’s paper, but without it the bootstrapping methods might not have been taken into use in taxonomy, since the uses of bootstrapping in taxonomy are not instantly obvious from Efron’s papers or book(s).

    Somehow back-propagated citations in such cases could be nice to have, though.
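
One way the back-propagation idea could be made concrete is sketched below, under stated assumptions: each application paper is linked by hand to the method paper it ports, and its citation count is credited up that chain. The `back_propagated` function and the `applies` mapping are hypothetical, and the two counts are only the round figures quoted elsewhere on this page (about 10,000 for Efron's bootstrap paper, about 25,000 for the phylogeny application). The result is an upper bound, since any paper citing both would be counted twice.

```python
def back_propagated(direct, applies):
    """Credit each paper's citations to the method papers it builds on.

    direct:  dict mapping paper -> its own citation count
    applies: dict mapping paper -> the method paper it applies (if any)
    Returns a dict of paper -> direct plus inherited citations.
    """
    totals = dict(direct)
    for paper, count in direct.items():
        seen = {paper}  # guards against accidental cycles in the mapping
        parent = applies.get(paper)
        while parent is not None and parent not in seen:
            totals[parent] = totals.get(parent, 0) + count
            seen.add(parent)
            parent = applies.get(parent)
    return totals


# Round figures as quoted in this discussion, not fresh lookups.
direct = {"Efron 1979": 10_000, "Felsenstein 1985": 25_000}
applies = {"Felsenstein 1985": "Efron 1979"}

totals = back_propagated(direct, applies)
```

With these inputs the original method paper ends up with the combined count, while the application paper keeps only its own.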

  7. Also:

    19,000 Multiple range and multiple F tests.
    DB Duncan – Biometrics, 1955 –

    14,000 Statistical analysis of cointegration vectors
    S Johansen – Journal of economic dynamics and control, 1988 – Elsevier

  8. 1. How could you miss the BIC Paper:
    Estimating the dimension of a model
    G Schwarz – The annals of statistics 1978, Cited by 20560

    2. The following paper:
    Inference of population structure using multilocus genotype data
    JK Pritchard, M Stephens, P Donnelly, Genetics 2000, cited by 11531

    Title & journal look like a genetics paper, but it was written by statisticians and introduced a new statistical model (very similar to latent Dirichlet allocation, later presented by Blei, Ng & Jordan)

    • Oz:

      Oohhhh, I hate that BIC thing! And how horrible to think that this paper got more cites than AIC. But I promise I didn’t exclude it on purpose; I just didn’t think of looking for it, nor did I come up with it in any of my searches.

      Searching some more I came across another Google Scholar error:

      24,000 Generalized linear models
      P McCullagh – European Journal of Operational Research, 1984 – Elsevier

      I’m pretty sure these are mostly references to the McCullagh and Nelder book of the same title.

      • Why do you hate BIC so much? Since reading it first I really liked it – elegant, short and just ‘feels right’. But I’m far from an expert so would like to know – what are the problems/drawbacks of the criterion? Do you have any valid criticism to justify your emotional reaction?

        • In some multivariate modeling examples (glm, gnm, glmm, etc.), I agree with you. BIC in isolation doesn’t appear to provide the same type of robustness about information ‘lossiness’ that its AIC counterpart does in these instances (I’ll leave AICc alone for now). Still, having both measures as guidance in model development can be far better than one (e.g., during model development, AIC and BIC can interestingly move in opposite directions as variables of low predictive power begin to be added to a more ‘established’ model; I’ve found this phenomenon to be an interesting indicator for when to rerank the remaining variables (e.g., via stepwise) not yet tested for inclusion in or exclusion from a model).
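
The "opposing directions" behaviour is easy to see from the standard definitions, AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L. The sketch below uses made-up numbers (sample size, parameter counts, and log-likelihoods are all hypothetical) to show a weak extra variable that AIC accepts and BIC rejects.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Schwarz/Bayesian information criterion: k ln(n) - 2 ln L."""
    return k * math.log(n) - 2 * log_lik


# Hypothetical fit: n = 1000 observations. Adding one weak variable
# (k goes from 5 to 6) raises the log-likelihood by only 1.5.
n = 1000
base = (-1200.0, 5)    # (log-likelihood, number of parameters)
bigger = (-1198.5, 6)

delta_aic = aic(*bigger) - aic(*base)        # 2 - 3 = -1: AIC improves
delta_bic = bic(*bigger, n) - bic(*base, n)  # ln(1000) - 3 > 0: BIC worsens
```

Because BIC's per-parameter penalty ln(n) exceeds AIC's penalty of 2 once n is larger than about 7.4, any gain in 2 ln L that falls between 2 and ln(n) will push the two criteria in opposite directions.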

  9. @Peter: April Fool’s……right? :)

    While the meshuggas continues in the frequent guffaws over citation frequency (please indulge further with extreme prejudice and sarcasm – and note if something ‘data sciency’ happens to spring forth with some fantastical citation volume), some of the ‘shockers’ mentioned even in jest I think should prompt questions over the nature, and potential tractability of the classification of the citations.

    I’ve often been curious about how citations might be employed. Is the citation a fairly insignificant, or even downright irrelevant, blurb? Is the citation critical to the research or its argument? If so, in what way and to what degree?
    Was there a referee/faculty/other request to add it (“Why didn’t you reference so-and-so’s ‘seminal’ paper on xxxx? You might want to consider this. Kind regards, Joey B. Gatekeeper”). Yep, here we go with impact factor rererethought :).

    To tag onto the Gscholar thread, I’ve noted more folks using this tool as a single source of citation searching. I’ve noted Wikipedia refs killing entries due to lit published in the 70s, which naturally Gscholar missed, and the armchair refs never thought to consult another citation index. Since when did Gscholar become the bastion of citation indexing? This could be a fun discussion :)

  10. One of the great injustices of citations is that Oaxaca-Blinder got all the credit for Duncan’s decomposition method, which he developed and published about a decade before Oaxaca-Blinder came along. As I recall, O-B recognized this, but that didn’t stop future generations of economists from citing the economists’ derivative paper instead of the sociologist’s original paper. A pretty typical pattern for economists, really.

  11. I am self-interested, as the author of that 1985 paper on applying the bootstrap to inference of phylogenies, which has 25K citations in Google Scholar. Should that be included? It gets more citations than Efron’s paper on the bootstrap. Is that just?

    I think that it just depends on what you want to count. Some fields have lots of people using particular statistical methods, and the papers they cite rise high in citation listings. The easiest way to correct for this would be to cite how often particular papers are cited in statistical journals, excluding the journals in application areas. If you did that, Efron’s paper would rise and mine would nearly disappear. (I will note that in the printed collection of Efron’s papers recently published, I commented on the Efron-Halloran-Holmes paper on bootstrapping phylogenies, and made exactly this point).

    I am proud to have published a paper that is about 7th in all of statistics, in citations by scientists. But I am happy to acknowledge that in terms of influence within statistics, my paper is a minor one. And that that would be reflected in its being far down the list of statistical papers frequently cited by statisticians.

    But what I think is a recipe for wrangling and confusion is to not make this distinction, to take widely cited papers on applications of statistics and try to come up with some reason why they “aren’t really statistics”. You end up becoming the Statistics Police, and making yourself look silly.

    • Joe:

      I would not want to restrict the count to citations in statistics journals, as I’m particularly interested in statistical methods that have been applied more generally. It happens that biology has a much larger scientific literature than, say, political science (for good reason: we put huge resources in biology so as to ultimately save lives and make people less sick and more comfortable!), and so methods that have particular relevance to biological applications get more citations. That seems fair to me. If a statistical method is important in biomedical research, it’s important in an absolute scale. If a statistical method is important in political science, or astronomy, or some other relatively small field, that’s fine but it’s ultimately having less of an applied impact (at least, as measured by citations, which seems to me to be a decent measure if not perfect). So I have no problem looking at papers that are cited in the biology literature, not at all.

      So, just to clarify, my measure of popularity in the above post is not “influence within statistics.” I really am concerned with influence in the scientific literature.

      In my post above, I wanted to distinguish between papers that developed new statistical methods from those that apply existing methods. I assume that a paper such as “Refinement of macromolecular structures by the maximum-likelihood method” is highly valuable (given that it has been cited over 10,000 times) but I’d rather not include it on the list of most-cited statistics papers in that, unlike the papers of Baron and Kenny, say, or Zadeh, or Heckman, or the others on that list, it’s not presenting a new method, it’s presenting the application of an existing statistical method. The boundaries here are not precise, but a classification is not useless just because it has some necessarily subjective aspects. My rule of not counting explications of existing methods in the list is similar to my rule of not counting books.

      Finally, just to be clear because otherwise readers of your comment might not realize: nobody in the above thread referred to your paper as “not really statistics.”

      • Sorry if I misinterpreted you as saying my paper was “not really statistics”. I must have misread this comment of yours:

        if someone develops new statistical theory or methods, then that’s a statistics paper, and I count as citations all the other statistics papers that cite it, and also all the applied papers that cite it. But if someone takes an existing statistical method and ports it to another field, I don’t consider it eligible for the “most cited statistics papers.”

        It was true that my paper was highly cited, but it was not considered true that “that’s a statistics paper”. I will accept your assurance that you never said that it “wasn’t really statistics”. That you said and meant that it was statistics, in a paper, but just wasn’t a “statistics paper”.

        I do think that your cri de coeur that

        “There’s something just so wrong about an application to the bootstrap having so many more citations than the original bootstrap paper!”

        is misguided. That’s just the way applied statistics often works, not a flaw in the system.

        I do think that influence within statistics would be interesting to measure, and observing which statistical papers are cited by other statisticians would be a possible approach, requiring manual intervention mostly in choosing an initial list of statistics journals. Your approach requires judgement calls on individual highly-cited papers, a different stage of the process.

        I would expect that in any such list of statistics papers influential with statisticians, mine would be basically absent.

        • Joe:

          That’s right, your paper was highly cited and it is statistics, but I did not want to include it on the list for the same reason that I did not want to include my own Bayesian Data Analysis on the list: I wanted to restrict to papers that developed original methods or syntheses. Arguably your paper is an original method in that it takes originality to apply a method developed in one field to another, just as arguably my book is an original synthesis because there’s a research contribution, not merely expository, to structuring a set of existing methods into a larger conceptual framework. Still, I felt more comfortable putting your paper and my book outside the box, because I wanted to focus on papers that were clearly developing something new in statistics.

          I agree that the line is not sharp but I still would like to draw the line somewhere. But in no way is this “line” a disparagement of work outside the line I’m drawing; I just think that contributions such as your paper and my book are a different sort from contributions such as those of Cox, White, and the others in the above list.

          As a separate point, I agree with you that my statement, “There’s something just so wrong about an application to the bootstrap having so many more citations than the original bootstrap paper!” is wrong. I hadn’t thought it through, and I think you’re completely right that this is how applied statistics often works, and that’s just fine.

  12. Marquardt, Donald W. “An algorithm for least-squares estimation of nonlinear parameters.” Journal of the Society for Industrial & Applied Mathematics 11.2 (1963): 431-441.
    21494 citations

    Bollerslev, Tim. “Generalized autoregressive conditional heteroskedasticity.” Journal of econometrics 31.3 (1986): 307-327.
    15553 citations

    Engle, Robert F., and Clive WJ Granger. “Co-integration and error correction: representation, estimation, and testing.” Econometrica: journal of the Econometric Society (1987): 251-276.
    23083 citations

  13. Citation frequency runs a bit counter to the idea that science is not decided by votes but by being correct (or longest unfalsified, to adhere to Popper’s logic). A bigger concern is that there still is, though it differs from science to science, a lot of material published in languages other than English. This slows reception and certainly distorts the citation index (and vice versa, the index shows only part of the world in science but is taken for the whole, esp. by raters). Developmental psychologist Piaget was publishing in French in Geneva, neighboring on France, Italy and Germany. But his reception history in Germany began after the first English translation appeared in the US. Citations should really be seen in context. Equally one should compute relative indexes, like “how many citations per total number of papers in a field” or “… total number of authors in a field” etc. Certainly it makes a difference if there are only ten papers published in a certain specialty in any given year and one of them gets cited in all of the ten, as against a field with thousands of papers where someone gets a hundred citations?!
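
The relative index proposed here amounts to a simple ratio, sketched below for the two cases in the comment. The function name is hypothetical, and since the comment only says "thousands of papers", the figure of 2,000 for the large field is an assumption made for illustration.

```python
def citation_share(citing_papers, field_papers):
    """Fraction of a field's papers that cite a given paper."""
    if field_papers <= 0:
        raise ValueError("field_papers must be positive")
    return citing_papers / field_papers


# Small specialty from the comment: ten papers a year, all ten cite it.
small_field = citation_share(10, 10)

# Large field: a hundred citations among an assumed 2,000 papers.
large_field = citation_share(100, 2000)
```

On raw counts the second paper wins by a factor of ten, while on this relative measure the first is cited by its whole field and the second by one paper in twenty.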

  15. Dempster, A. P., Laird, N. M., & Rubin, D. B. (1977). Maximum likelihood from incomplete data via the EM algorithm. Journal of the royal statistical society. Series B (methodological), 1-38.

    40499 citations
