Most controversial posts of 2020

Last year we posted 635 entries on this blog. Above is a histogram of the number of comments on each of the posts. The bars are each of width 5, except that I made a special bar just for the posts with zero comments. There’s nothing special about zero here; some posts get only 1 or 2 comments, and some happen to get 0. Also, number of comments is not the same as number of views. I don’t have easy access to that sort of blog statistic, which is just as well or I might end up wasting time looking at it.
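
For anyone who wants to try this sort of binning on their own data, here is a minimal sketch in Python. It is not the code used to make the plot above. It assumes the width-5 bars cover 1-5, 6-10, and so on, which the post does not specify; the function name `bin_comment_counts` and the sample counts are made up for illustration.

```python
from collections import Counter


def bin_comment_counts(counts, width=5):
    """Tally comment counts into a separate zero bin plus bins of `width`.

    Assumed bin edges (not stated in the post): 1-5, 6-10, 11-15, ...
    Returns a dict mapping (low, high) -> number of posts, sorted by low.
    """
    tally = Counter()
    for n in counts:
        if n == 0:
            # Posts with no comments get their own bar.
            tally[(0, 0)] += 1
        else:
            # Lower edge of the width-sized bin that contains n.
            lo = ((n - 1) // width) * width + 1
            tally[(lo, lo + width - 1)] += 1
    return dict(sorted(tally.items()))


# Hypothetical comment counts, not the blog's actual data.
example = [0, 0, 1, 2, 5, 6, 20, 636]
bins = bin_comment_counts(example)

# Crude text histogram: one '#' per post in each bin.
for (lo, hi), k in bins.items():
    label = "0" if hi == 0 else f"{lo}-{hi}"
    print(f"{label:>8}: {'#' * k}")
```

Only non-empty bins are printed here, so a real plot would still need the empty bars filled in to keep the x-axis to scale.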

In any case, I wasn’t so thrilled with the histogram. I usually am not such a fan of histograms for displaying data, as the histogram involves this extra level of abstraction. Each bar is a category, not a data point. So I tried a time series plot. The posts are time-stamped but I was kind of lazy so I just plotted the number of comments in time order and then labeled the beginning, middle, and end of the time period on the x-axis:

And here’s a list of all of last year’s posts, in decreasing order of number of comments. You can draw your own conclusions from all this.

  • “So the real scandal is: Why did anyone ever listen to this guy?” (636 comments)
  • Concerns with that Stanford study of coronavirus prevalence (431 comments)
  • Coronavirus in Sweden, what’s the story? (315 comments)
  • (Some) forecasting for COVID-19 has failed: a discussion of Taleb and Ioannidis et al. (224 comments)
  • So much of academia is about connections and reputation laundering (224 comments)
  • “I don’t want ‘crowd peer review’ or whatever you want to call it,” he said. “It’s just too burdensome and I’d rather have a more formal peer review process.” (203 comments)
  • Don’t kid yourself. The polls messed up—and that would be the case even if we’d forecasted Biden losing Florida and only barely winning the electoral college (197 comments)
  • Coronavirus Quickies (194 comments)
  • Reverse-engineering the problematic tail behavior of the Fivethirtyeight presidential election forecast (184 comments)
  • Coronavirus Grab Bag: deaths vs qalys, safety vs safety theater, ‘all in this together’, and more. (181 comments)
  • Math error in herd immunity calculation from CNN epidemiology expert (178 comments)
  • Years of Life Lost due to coronavirus (178 comments)
  • Unfair to James Watson? (177 comments)
  • 10 on corona (175 comments)
  • What would would mean to really take seriously the idea that our forecast probabilities were too far from 50%? (174 comments)
  • Facemasks in Germany (170 comments)
  • Coronavirus age-specific fatality ratio, estimated using Stan, and (attempting) to account for underreporting of cases and the time delay to death. Now with data and code. And now a link to another paper (also with data and code). (166 comments)
  • More coronavirus testing results, this time from Los Angeles (165 comments)
  • Hydroxychloroquine update (156 comments)
  • “RA Fisher and the science of hatred” (154 comments)
  • The p-value is 4.76×10^−264 1 in a quadrillion (153 comments)
  • I’m frustrated by the politicization of the coronavirus discussion. Here’s an example: (152 comments)
  • What’s the American Statistical Association gonna say in their Task Force on Statistical Significance and Replicability? (152 comments)
  • Given that 30% of Americans believe in astrology, it’s no surprise that some nontrivial percentage of influential American psychology professors are going to have the sort of attitude toward scientific theory and evidence that would lead them to have strong belief in weak theories supported by no good evidence. (151 comments)
  • Are GWAS studies of IQ/educational attainment problematic? (142 comments)
  • Association for Psychological Science takes a hard stand against criminal justice reform (141 comments)
  • In this particular battle between physicists and economists, I’m taking the economists’ side. (140 comments)
  • Male bisexuality gets Big PNAS Energy (137 comments)
  • “The Evidence and Tradeoffs for a ‘Stay-at-Home’ Pandemic Response: A multidisciplinary review examining the medical, psychological, economic and political impact of ‘Stay-at-Home’ implementation in America” (133 comments)
  • What are the best scientific papers ever written? (132 comments)
  • Would we be better off if randomized clinical trials had never been born? (131 comments)
  • Reverse-engineering priors in coronavirus discourse (131 comments)
  • New report on coronavirus trends: “the epidemic is not under control in much of the US . . . factors modulating transmission such as rapid testing, contact tracing and behavioural precautions are crucial to offset the rise of transmission associated with loosening of social distancing . . .” (130 comments)
  • “For the cost of running 96 wells you can test 960 people and accurate assess the prevalence in the population to within about 1%. Do this at 100 locations around the country and you’d have a spatial map of the extent of this epidemic today. . . and have this data by Monday.” (128 comments)
  • More on martingale property of probabilistic forecasts and some other issues with our election model (124 comments)
  • What can we learn from super-wide uncertainty intervals? (123 comments)
  • Retired computer science professor argues that decisions are being made by “algorithms that are mathematically incapable of bias.” What does this mean? (122 comments)
  • Don’t Hate Undecided Voters (122 comments)
  • “America is used to blaming individuals for systemic problems. Let’s try to avoid that this time.” (122 comments)
  • Association for Psychological Science claims that they can “add our voice and expertise to bring about positive change and to stand against injustice and racism in all forms” . . . but I’m skeptical. (121 comments)
  • University of Washington biostatistician unhappy with ever-changing University of Washington coronavirus projections (118 comments)
  • “What is the conclusion of a clinical trial where p=0.6?” (116 comments)
  • Discussion of uncertainties in the coronavirus mask study leads us to think about some issues . . . (115 comments)
  • The second derivative of the time trend on the log scale (also see P.S.) (115 comments)
  • Literally a textbook problem: if you get a positive COVID test, how likely is it that it’s a false positive? (114 comments)
  • NPR’s gonna NPR (special coronavirus junk science edition) (113 comments)
  • Do we really believe the Democrats have an 88% chance of winning the presidential election? (107 comments)
  • This one’s important: Designing clinical trials for coronavirus treatments and vaccines (106 comments)
  • Hilda Bastian and John Ioannidis on coronavirus decision making; Jon Zelner on virus progression models (106 comments)
  • Holes in Bayesian Statistics (106 comments)
  • Vaccine development as a decision problem (104 comments)
  • Problem of the between-state correlations in the Fivethirtyeight election forecast (103 comments)
  • The Economist not hedging the predictions (102 comments)
  • What about this idea of rapid antigen testing? (101 comments)
  • They want “statistical proof”—whatever that is! (101 comments)
  • No, I don’t believe that claim based on regression discontinuity analysis that . . . (101 comments)
  • Coronavirus: the cathedral or the bazaar, or the cathedral and the bazaar? (101 comments)
  • The seventy two percent solution (to police violence) (99 comments)
  • Coronavirus “hits all the hot buttons” for promoting the scientist-as-hero narrative (cognitive psychology edition) (99 comments)
  • Am I missing something here? This estimate seems off by several orders of magnitude! (98 comments)
  • What is the probability that someone you know will die from COVID-19 this year? (97 comments)
  • Flaxman et al. respond to criticisms of their estimates of effects of anti-coronavirus policies (96 comments)
  • Comparing election outcomes to our forecast and to the previous election (96 comments)
  • 17 state attorney generals, 100 congressmembers, and the Association for Psychological Science walk into a bar (95 comments)
  • Is vs. ought in the study of public opinion: Coronavirus “opening up” edition (95 comments)
  • Big trouble coming with the 2020 Census (94 comments)
  • Regression and Other Stories is available! (92 comments)
  • Where are the famous dogs? Where are the famous animals? (89 comments)
  • Where are the collaborative novels? (87 comments)
  • Thinking about election forecast uncertainty (87 comments)
  • Updates of bad forecasts: Let’s follow them up and see what happened! (85 comments)
  • Coronavirus PANIC news (85 comments)
  • Resolving the cathedral/bazaar problem in coronavirus research (and science more generally): Could we follow the model of genetics research (as suggested by some psychology researchers)? (84 comments)
  • Understanding Janet Yellen (83 comments)
  • Concerns with our Economist election forecast (83 comments)
  • Presidents as saviors vs. presidents as being hired to do a job (83 comments)
  • This controversial hydroxychloroquine paper: What’s Lancet gonna do about it? (83 comments)
  • Hey, I think something’s wrong with this graph! Free copy of Regression and Other Stories to the first commenter who comes up with a plausible innocent explanation of this one. (83 comments)
  • Debate involving a bad analysis of GRE scores (81 comments)
  • “Stay-at-home” behavior: A pretty graph but I have some questions (81 comments)
  • New analysis of excess coronavirus mortality; also a question about poststratification (81 comments)
  • Some wrong lessons people will learn from the president’s illness, hospitalization, and expected recovery (80 comments)
  • What happens to the median voter when the electoral median is at 52/48 rather than 50/50? (79 comments)
  • Moving blog to twitter (79 comments)
  • Estimating efficacy of the vaccine from 95 true infections (78 comments)
  • Is there a middle ground in communicating uncertainty in election forecasts? (77 comments)
  • Know your data, recode missing data codes (77 comments)
  • Conflicting public attitudes on redistribution (77 comments)
  • What’s Google’s definition of retractable? (76 comments)
  • OK, here’s a hierarchical Bayesian analysis for the Santa Clara study (and other prevalence studies in the presence of uncertainty in the specificity and sensitivity of the test) (75 comments)
  • Causal inference in AI: Expressing potential outcomes in a graphical-modeling framework that can be fit using Stan (74 comments)
  • Which experts should we trust? (73 comments)
  • “No one is going to force you to write badly. In the long run, you won’t even be rewarded for it. But, unfortunately, it is true that they’ll often let you get away with it.” (73 comments)
  • A better way to visualize the spread of coronavirus in different countries? (73 comments)
  • What went wrong with the polls in 2020? Another example. (72 comments)
  • The Pfizer-Biontech Vaccine May Be A Lot More Effective Than You Think? (72 comments)
  • So, what’s with that claim that Biden has a 96% chance of winning? (some thoughts with Josh Miller) (72 comments)
  • More on that Fivethirtyeight prediction that Biden might only get 42% of the vote in Florida (72 comments)
  • FDA statistics scandal update (71 comments)
  • Who were the business superstars of the 1970s? (71 comments)
  • Imperial College report on Italy is now up (71 comments)
  • Blog about a column about the Harper’s letter: Here’s some discourse about a discourse about what happens when the discourse takes precedence over reality (70 comments)
  • Comments on the new fivethirtyeight.com election forecast (69 comments)
  • The history of low-hanging intellectual fruit (69 comments)
  • New York coronavirus antibody study: Why I had nothing to say to the press on this one. (69 comments)
  • Why it can be rational to vote (68 comments)
  • That “not a real doctor” thing . . . It’s kind of silly for people to think that going to medical school for a few years will give you the skills necessary to be able to evaluate research claims in medicine or anything else. (68 comments)
  • Parking lot statistics—a story in three parts (68 comments)
  • RCT on use of cloth vs surgical masks (68 comments)
  • “How to be Curious Instead of Contrarian About COVID-19: Eight Data Science Lessons From Coronavirus Perspective” (68 comments)
  • One dose or two? This epidemiologist suggests we should follow Bugs Bunny and go for two. (67 comments)
  • Calibration and recalibration. And more recalibration. IHME forecasts by publication date (67 comments)
  • New coronavirus forecasting model (67 comments)
  • Thomas Basbøll will like this post (analogy between common—indeed, inevitable—mistakes in drawing, and inevitable mistakes in statistical reasoning). (66 comments)
  • The War on Data: Now we play the price (66 comments)
  • Statistical controversy on estimating racial bias in the criminal justice system (66 comments)
  • Why X’s think they’re the best (66 comments)
  • How scientists perceive advancement of knowledge from conflicting review reports (66 comments)
  • The best coronavirus summary so far (66 comments)
  • Hey! Let’s check the calibration of some coronavirus forecasts. (66 comments)
  • It’s kinda like phrenology but worse. Not so good for the “Nature” brand name, huh? Measurement, baby, measurement. (65 comments)
  • Randomized but unblinded experiment on vitamin D as a coronavirus treatment. Let’s talk about what comes next. (Hint: it involves multilevel models.) (65 comments)
  • No, they won’t share their data. (65 comments)
  • Are female scientists worse mentors? This study pretends to know (64 comments)
  • “Stop me if you’ve heard this one before: Ivy League law professor writes a deepthoughts think piece explaining a seemingly irrational behavior that doesn’t actually exist.” (64 comments)
  • We taught a class using Zoom yesterday. Here’s what we learned. (64 comments)
  • Are we constantly chasing after these population-level effects of these non-pharmaceutical interventions that are hard to isolate when there are many good reasons to believe in their efficacy in the first instance? (62 comments)
  • Decision-making under uncertainty: heuristics vs models (62 comments)
  • So . . . what about that claim that probabilistic election forecasts depress voter turnout? (62 comments)
  • MIT’s science magazine misrepresents critics of Stanford study (61 comments)
  • bla bla bla PEER REVIEW bla bla bla (61 comments)
  • Are informative priors “[in]compatible with standards of research integrity”? Click to find out!! (61 comments)
  • Coronavirus model update: Background, assumptions, and room for improvement (61 comments)
  • The Paterno Defence: Gladwell’s Tipping Point? (61 comments)
  • “Psychology’s Zombie Ideas” (60 comments)
  • Junk Science Then and Now (60 comments)
  • Cops’ views (59 comments)
  • Expert writes op-ed in NYT recommending that we trust the experts (59 comments)
  • Against overly restrictive definitions: No, I don’t think it helps to describe Bayes as “the analysis of subjective beliefs” (nor, for that matter, does it help to characterize the statements of Krugman or Mankiw as not being “economics”) (58 comments)
  • This one’s for the Lancet editorial board: A trolley problem for our times (involving a plate of delicious cookies and a steaming pile of poop) (58 comments)
  • Understanding the “average treatment effect” number (57 comments)
  • Fake MIT journalists misrepresent real Buzzfeed journalist. (Maybe we shouldn’t be so surprised?) (57 comments)
  • Information or Misinformation During a Pandemic: Comparing the effects of following Nassim Taleb, Richard Epstein, or Cass Sunstein on twitter. (57 comments)
  • Putting Megan Higgs and Thomas Basbøll in the room together (57 comments)
  • Steven Pinker on torture (57 comments)
  • No, I don’t think that this study offers good evidence that installing air filters in classrooms has surprisingly large educational benefits. (57 comments)
  • How to think about extremely unlikely events (such as Biden winning Alabama, Trump winning California, or Biden winning Ohio but losing the election)? (56 comments)
  • In case you’re wondering . . . this is why the U.S. health care system is the most expensive in the world (56 comments)
  • This is not a post about remdesivir. (56 comments)
  • Do these data suggest that UPS, Amazon, etc., should be quarantining packages? (56 comments)
  • Does this fallacy have a name? (55 comments)
  • The challenge of fitting “good advice” into a coherent course on statistics (55 comments)
  • “Estimating the effects of non-pharmaceutical interventions on COVID-19 in Europe” (55 comments)
  • Updated Santa Clara coronavirus report (55 comments)
  • Somethings do not seem to spread easily – the role of simulation in statistical practice and perhaps theory. (54 comments)
  • 2 perspectives on the relevance of social science to our current predicament: (1) social scientists should back off, or (2) social science has a lot to offer (54 comments)
  • How to get out of the credulity rut (regression discontinuity edition): Getting beyond whack-a-mole (54 comments)
  • BMJ update: authors reply to our concerns (but I’m not persuaded) (53 comments)
  • “Sorry, there is no peer review to display for this article.” Huh? Whassup, BMJ? (53 comments)
  • No, It’s Not a Prisoner’s Dilemma (the second in a continuing series): (52 comments)
  • “Inferring the effectiveness of government interventions against COVID-19” (52 comments)
  • Coding and drawing (52 comments)
  • Resolving confusions over that study of “Teacher Effects on Student Achievement and Height” (52 comments)
  • An open letter expressing concerns regarding the statistical analysis and data integrity of a recently published and publicized paper (52 comments)
  • Career advice for a future statistician (52 comments)
  • Quine’s be Quining (51 comments)
  • Political polarization of professions (51 comments)
  • Econ grad student asks, “why is the government paying us money, instead of just firing us all?” (51 comments)
  • “As a girl, she’d been very gullible, but she had always learned more that way.” (51 comments)
  • Four projects in the intellectual history of quantitative social science (51 comments)
  • “Dream Investigation Results: Official Report by the Minecraft Speedrunning Team” (50 comments)
  • Probabilistic forecasts cause general misunderstanding. What to do about this? (50 comments)
  • But the top graph looked like such strong evidence! (50 comments)
  • “Curing Coronavirus Isn’t a Job for Social Scientists” (50 comments)
  • 2 econ Nobel prizes, 1 error (49 comments)
  • No, average statistical power is not as high as you think: Tracing a statistical error as it spreads through the literature (49 comments)
  • What a difference a month makes (polynomial extrapolation edition) (49 comments)
  • “In the world of educational technology, the future actually is what it used to be” (48 comments)
  • Post-election post (48 comments)
  • Simple Bayesian analysis inference of coronavirus infection rate from the Stanford study in Santa Clara county (48 comments)
  • Negativity (when applied with rigor) requires more care than positivity. (47 comments)
  • Mike Pence and Rush Limbaugh on smoking, cancer, and the coronavirus (47 comments)
  • Linear or logistic regression with binary outcomes (47 comments)
  • The importance of descriptive social science and its relation to causal inference and substantive theories (46 comments)
  • Drunk-under-the-lamppost testing (46 comments)
  • Can we stop talking about how we’re better off without election forecasting? (45 comments)
  • Calibration problem in tails of our election forecast (45 comments)
  • How should those Lancet/Surgisphere/Harvard data have been analyzed? (45 comments)
  • Alexey Guzey’s sleep deprivation self-experiment (45 comments)
  • Some recommendations for design and analysis of clinical trials, with application to coronavirus (45 comments)
  • The NeurIPS 2020 broader impacts experiment (44 comments)
  • UX issues around voting (44 comments)
  • Thank you, James Watson. Thank you, Peter Ellis. (Lancet: You should do the right thing and credit them for your retraction. Actually, do one better and invite them to write a joint editorial in your journal.) (44 comments)
  • Who are you gonna believe, me or your lying eyes? (43 comments)
  • Don’t say your data “reveal quantum nature of human judgments.” Be precise and say your data are “consistent with a quantum-inspired model of survey responses.” Yes, then your paper might not appear in PNAS, but you’ll feel better about yourself in the morning. (43 comments)
  • “The Generalizability Crisis” in the human sciences (43 comments)
  • Statistics is hard, especially if you don’t know any statistics (FDA edition) (42 comments)
  • Shortest posterior intervals (42 comments)
  • Election 2020 is coming: Our poll aggregation model with Elliott Morris of the Economist (42 comments)
  • “Banishing ‘Black/White Thinking’: A Trio of Teaching Tricks” (42 comments)
  • Stan pedantic mode (42 comments)
  • I’m still struggling to understand hypothesis testing . . . leading to a more general discussion of the role of assumptions in statistics (42 comments)
  • Just some numbers from Canada (42 comments)
  • Bishops of the Holy Church of Embodied Cognition and editors of the Proceedings of the National Academy of Christ (41 comments)
  • (1) The misplaced burden of proof, and (2) selection bias: Two reasons for the persistence of hype in tech and science reporting (41 comments)
  • Coronavirus disparities in Palestine and in Michigan (41 comments)
  • Some thoughts inspired by Lee Cronbach (1975), “Beyond the two disciplines of scientific psychology” (41 comments)
  • Some thoughts on another failed replication in psychology (41 comments)
  • Some of you must have an idea of the answer to this one. (41 comments)
  • Interesting y-axis (41 comments)
  • Some questions from high school students about college and future careers (40 comments)
  • “The Intellectuals and the Masses” (40 comments)
  • “Positive Claims get Publicity, Refutations do Not: Evidence from the 2020 Flu” (40 comments)
  • New dataset: coronavirus tracking using data from smart thermometers (40 comments)
  • Is there any scientific evidence that humans don’t like uncertainty? (40 comments)
  • Votes vs. $ (40 comments)
  • Does regression discontinuity (or, more generally, causal identification + statistical significance) make you gullible? (39 comments)
  • Risk aversion is not a thing (39 comments)
  • Getting all negative about so-called average power (39 comments)
  • More than one, always more than one to address the real uncertainty. (39 comments)
  • Pandemic cats following social distancing (39 comments)
  • A new hot hand paradox (38 comments)
  • “Postmortem of a replication drama in computer science” (38 comments)
  • Low rate of positive coronavirus tests (38 comments)
  • Can someone build a Bayesian tool that takes into account your symptoms and where you live to estimate your probability of having coronavirus? (38 comments)
  • The intellectual explosion that didn’t happen (38 comments)
  • What are my statistical principles? (37 comments)
  • BMJ FAIL: The system is broken. (Some reflections on bad research, scientism, the importance of description, and the challenge of negativity) (37 comments)
  • The value of thinking about varying treatment effects: coronavirus example (37 comments)
  • “The good news about this episode is that it’s kinda shut up those people who were criticizing that Stanford antibody study because it was an un-peer-reviewed preprint. . . .” and a P.P.P.S. with Paul Alper’s line about the dead horse (37 comments)
  • Last post on hydroxychloroquine (perhaps) (37 comments)
  • Doubts about that article claiming that hydroxychloroquine/chloroquine is killing people (37 comments)
  • If the outbreak ended, does that mean the interventions worked? (Jon Zelner talk tomorrow) (37 comments)
  • “1919 vs. 2020” (37 comments)
  • Vaping statistics controversy update: A retraction and some dispute (37 comments)
  • Researcher offers ridiculous reasons for refusing to reassess work in light of serious criticism (37 comments)
  • Forget about multiple testing corrections. Actually, forget about hypothesis testing entirely. (37 comments)
  • Advice for a yoga studio that wants to reopen? (36 comments)
  • Harvard-laundering (the next stage of the Lancet scandal) (36 comments)
  • Breaking the feedback loop: When people don’t correct their errors (36 comments)
  • Woof! for descriptive statistics (36 comments)
  • “New research suggests Sanders would drive swing voters to Trump — and need a youth turnout miracle to compensate.” (36 comments)
  • Evidence-based medicine eats itself (36 comments)
  • “We’ve got to look at the analyses, the real granular data. It’s always tough when you’re looking at a press release to figure out what’s going on.” (35 comments)
  • My proposal is to place criticism within the scientific, or social-scientific, enterprise, rather than thinking about it as something coming from outside, or as something that is tacked on at the end. (35 comments)
  • Reasoning under uncertainty (35 comments)
  • Considerate Swedes only die during the week. (35 comments)
  • “Older Americans are more worried about coronavirus — unless they’re Republican” (35 comments)
  • What do Americans think about coronavirus restrictions? Let’s see what the data say . . . (34 comments)
  • “I Can’t Believe It’s Not Better” (34 comments)
  • The view that the scientific process is “red tape,” just a bunch of hoops you need to jump through so you can move on with your life (34 comments)
  • Coronavirus jailbreak (34 comments)
  • The value (or lack of value) of preregistration in the absence of scientific theory (34 comments)
  • Deterministic thinking meets the fallacy of the one-sided bet (33 comments)
  • Derived quantities and generative models (33 comments)
  • Update on social science debate about measurement of discrimination (33 comments)
  • Surgisphere scandal: Lancet still doesn’t get it (33 comments)
  • You don’t want a criminal journal… you want a criminal journal (33 comments)
  • Is causality as explicit in fake data simulation as it should be? (32 comments)
  • How much of public health work “involves not technology but methodicalness and record keeping”? (32 comments)
  • Challenges to the Reproducibility of Machine Learning Models in Health Care; also a brief discussion about not overrating randomized clinical trials (32 comments)
  • Baby alligators: Adorable, deadly, or endangered? You decide. (32 comments)
  • Updated Imperial College coronavirus model, including estimated effects on transmissibility of lockdown, social distancing, etc. (32 comments)
  • Are we ready to move to the “post p < 0.05 world”? (32 comments)
  • No, I don’t believe etc etc., even though they did a bunch of robustness checks. (31 comments)
  • Coronavirus dreams (31 comments)
  • “The Taboo Against Explicit Causal Inference in Nonexperimental Psychology” (31 comments)
  • Please socially distance me from this regression model! (31 comments)
  • Is JAMA potentially guilty of manslaughter? (31 comments)
  • Is data science a discipline? (31 comments)
  • Abuse of expectation notation (31 comments)
  • The latest Perry Preschool analysis: Noisy data + noisy methods + flexible summarizing = Big claims (31 comments)
  • What are the most important statistical ideas of the past 50 years? (30 comments)
  • How to think about correlation? It’s the slope of the regression when x and y have been standardized. (30 comments)
  • “Fake Facts in Covid-19 Science: Kentucky vs. Tennessee.” (30 comments)
  • Their findings don’t replicate, but they refuse to admit they might’ve messed up. (We’ve seen this movie before.) (30 comments)
  • The typical set and its relevance to Bayesian computation (30 comments)
  • In Bayesian inference, do people cheat by rigging the prior? (30 comments)
  • My theory of why TV sports have become less popular (29 comments)
  • An example of a parallel dot plot: a great way to display many properties of a list of items (29 comments)
  • “MIT Built a Theranos for Plants” (29 comments)
  • Who was the first literary schlub? (29 comments)
  • “Frequentism-as-model” (29 comments)
  • The “scientist as hero” narrative (29 comments)
  • In Bayesian priors, why do we use soft rather than hard constraints? (29 comments)
  • They added a hierarchical structure to their model and their parameter estimate changed a lot: How to think about this? (29 comments)
  • Merlin did some analysis of possible electoral effects of rejections of vote-by-mail ballots . . . (28 comments)
  • Everything that can be said can be said clearly. (28 comments)
  • Statistics controversies from the perspective of industrial statistics (28 comments)
  • The return of the red state blue state fallacy (28 comments)
  • How the election might have looked in a world without polls (27 comments)
  • All maps of parameter estimates are (still) misleading (27 comments)
  • Reference for the claim that you need 16 times as much data to estimate interactions as to estimate main effects (27 comments)
  • Florida. Comparing Economist and Fivethirtyeight forecasts. (27 comments)
  • Kafka comes to the visa office (27 comments)
  • Super-duper online matrix derivative calculator vs. the matrix normal (for Stan) (27 comments)
  • We need better default plots for regression. (27 comments)
  • Hey, you. Yeah, you! Stop what you’re doing RIGHT NOW and read this Stigler article on the history of robust statistics (27 comments)
  • Number of deaths or number of deaths per capita (27 comments)
  • Stasi’s back in town. (My last post on Cass Sunstein and Richard Epstein.) (27 comments)
  • Basbøll’s Audenesque paragraph on science writing, followed by a resurrection of a 10-year-old debate on Gladwell (26 comments)
  • Between-state correlations and weird conditional forecasts: the correlation depends on where you are in the distribution (26 comments)
  • His data came out in the opposite direction of his hypothesis. How to report this in the publication? (26 comments)
  • Here’s a question for the historians of science out there: How modern is the idea of a scientific “anomaly”? (26 comments)
  • “Why do the results of immigrant students depend so much on their country of origin and so little on their country of destination?” (26 comments)
  • Uncertainty and variation as distinct concepts (26 comments)
  • ,26 comments)
  • An article in a statistics or medical journal, “Using Simulations to Convince People of the Importance of Random Variation When Interpreting Statistics.” (26 comments)
  • Is it really true that candidates who are perceived as ideologically extreme do even worse if “they actually pose as more radical than they really are”? (26 comments)
  • I ain’t the lotus (25 comments)
  • “A better way to roll out Covid-19 vaccines: Vaccinate everyone in several hot zones”? (25 comments)
  • Here’s why rot13 text looks so cool. (25 comments)
  • Whassup with the dots on our graph? (25 comments)
  • The NBA strike and what does it take to keep stories in the news (25 comments)
  • Can the science community help journalists avoid science hype? It won’t be easy. (25 comments)
  • Probabilities for action and resistance in Blades in the Dark (25 comments)
  • How good is the Bayes posterior for prediction really? (25 comments)
  • How to describe Pfizer’s beta(0.7, 1) prior on vaccine effect? (24 comments)
  • I like this way of mapping electoral college votes (24 comments)
  • Alexey Guzey plays Stat Detective: How many observations are in each bar of this graph? (24 comments)
  • Association Between Universal Curve Fitting in a Health Care Journal and Journal Acceptance Among Health Care Researchers (24 comments)
  • “In any case, we have a headline optimizer that A/B tests different headlines . . .” (24 comments)
  • Marc Hauser: Victim of statistics? (24 comments)
  • How many patients do doctors kill by accident? (24 comments)
  • Why I Rant (24 comments)
  • IEEE’s Refusal to Issue Corrections (23 comments)
  • 53 fever! (23 comments)
  • Tracking R of COVID-19 & assessing public interventions; also some general thoughts on science (23 comments)
  • The Shrinkage Trilogy: How to be Bayesian when analyzing simple experiments (22 comments)
  • Follow-up on yesterday’s posts: some maps are less misleading than others. (22 comments)
  • A question of experimental design (more precisely, design of data collection) (22 comments)
  • “Figure 1 looks like random variation to me” . . . indeed, so it does. And Figure 2 as well! But statistical significance was found, so this bit of randomness was published in a top journal. Business as usual in the statistical-industrial complex. Still, I’d hope the BMJ could’ve done better. (22 comments)
  • “To Change the World, Behavioral Intervention Research Will Need to Get Serious About Heterogeneity” (22 comments)
  • Election odds update (Biden still undervalued but not by so much) (22 comments)
  • Computer-generated writing that looks real; real writing that looks computer-generated (22 comments)
  • Three unblinded mice (21 comments)
  • We want certainty even when it’s not appropriate (21 comments)
  • The flashy crooks get the headlines, but the bigger problem is everyday routine bad science done by non-crooks (21 comments)
  • Priors on effect size in A/B testing (21 comments)
  • We need to practice our best science hygiene. (21 comments)
  • Naming conventions for variables, functions, etc. (21 comments)
  • Different challenges in replication in biomedical vs. social sciences (21 comments)
  • Intended consequences are the worst (21 comments)
  • The fallacy of the excluded rationality (21 comments)
  • No, this senatorial stock-picking study does not address concerns about insider trading: (20 comments)
  • “It’s turtles for quite a way down, but at some point it’s solid bedrock.” (20 comments)
  • The rise and fall and rise of randomized controlled trials (RCTs) in international development (20 comments)
  • Why is this graph actually ok? It’s the journey, not just the destination. (20 comments)
  • Body language and machine learning (20 comments)
  • Estimated “house effects” (biases of pre-election surveys from different pollsters) and here’s why you have to be careful not to overinterpret them: (20 comments)
  • “Congressional Representation: Accountability from the Constituent’s Perspective” (20 comments)
  • From monthly return rate to importance sampling to path sampling to the second law of thermodynamics to metastable sampling in Stan (20 comments)
  • Bolivia election fraud fraud update (20 comments)
  • The New Yorker fiction podcast: how it’s great and how it could be improved (20 comments)
  • David Leavitt and Meg Wolitzer (20 comments)
  • Top 5 literary descriptions of poker (20 comments)
  • Open forensic science, and some general comments on the problems of legalistic thinking when discussing open science (20 comments)
  • Bayesian Workflow (19 comments)
  • “this large reduction in response rats” (19 comments)
  • Varimax: Sure, it’s always worked but now there’s maths! (19 comments)
  • New England Journal of Medicine engages in typical academic corporate ass-covering behavior (19 comments)
  • Roll Over Mercator: Awesome map shows the unreasonable effectiveness of mixture models (19 comments)
  • Retraction of racial essentialist article that appeared in Psychological Science (19 comments)
  • How unpredictable is the 2020 election? (19 comments)
  • Fit nonlinear regressions in R using stan_nlmer (19 comments)
  • Birthdays! (19 comments)
  • “Repeating the experiment” as general advice on data collection (19 comments)
  • Of Manhattan Projects and Moonshots (19 comments)
  • Greek statistician is in trouble for . . . telling the truth! (18 comments)
  • Stephen Wolfram invented a time machine but has been too busy to tell us about it (18 comments)
  • Sleep injury spineplot (18 comments)
  • Battle of the open-science asymmetries (18 comments)
  • Do we trust this regression? (18 comments)
  • The checklist manifesto and beyond (18 comments)
  • This study could be just fine, or not. Maybe I’ll believe it if there’s an independent preregistered replication. (18 comments)
  • BREAKING: MasterClass Announces NEW Class on Science of Sleep by Neuroscientist & Sleep Expert Matthew Walker – Available NOW (17 comments)
  • More on the Heckman curve (17 comments)
  • Pre-register post-election analyses? (17 comments)
  • She’s wary of the consensus based transparency checklist, and here’s a paragraph we should’ve added to that zillion-authored paper (17 comments)
  • Covid-19 -> Kovit-17 (following the himmicanes principle) (17 comments)
  • Automatic data reweighting! (17 comments)
  • Theorizing, thought experiments, fake-data simulation (17 comments)
  • Further debate over mindset interventions (17 comments)
  • Ugly code is buggy code (17 comments)
  • The two most important formulas in statistics (17 comments)
  • Create your own community (if you need to) (17 comments)
  • Estimating the mortality rate from corona? (17 comments)
  • How much of Trump’s rising approval numbers can be attributed to differential nonresponse? P.S. With more analysis of recent polls from Jacob Long (17 comments)
  • Royal Society spam & more (17 comments)
  • Response to a question about a reference in one of our papers (16 comments)
  • What does it take to be omniscient? (16 comments)
  • The turtles stop here. Why we meta-science: a meta-meta-science manifesto (16 comments)
  • “The Moral Economy of Science” (16 comments)
  • Here’s what academic social, behavioral, and economic scientists should be working on right now. (16 comments)
  • Make Andrew happy with one simple ggplot trick (16 comments)
  • Conference on Mister P online tomorrow and Saturday, 3-4 Apr 2020 (16 comments)
  • My best thoughts on priors (16 comments)
  • MRP Carmelo Anthony update . . . Trash-talking’s fine. But you gotta give details, or links, or something! (16 comments)
  • Which teams have fewer fans than their namesake? I pretty much like this person’s reasoning except when we get to the chargers and raiders. (16 comments)
  • Is it accurate to say, “Politicians Don’t Actually Care What Voters Want”? (16 comments)
  • Of book reviews and selection bias (16 comments)
  • More limitations of cross-validation and actionable recommendations (15 comments)
  • “Time Travel in the Brain” (15 comments)
  • One more Bolivia election fraud fraud thing (15 comments)
  • Using the rank-based inverse normal transformation (15 comments)
  • Come up with a logo for causal inference! (15 comments)
  • How to “cut” using Stan, if you must (15 comments)
  • This graduate student wants to learn statistics to be a better policy analyst (15 comments)
  • Don’t ever change, social psychology! You’re perfect just the way you are (14 comments)
  • Rob Kass: “The truth of a theory is contingent on both our state of knowledge and the purposes to which it will be put.” (14 comments)
  • David Spiegelhalter wants a checklist for quality control of statistical models? (14 comments)
  • Heckman Curve Update Update (14 comments)
  • Dispelling confusion about MRP (multilevel regression and poststratification) for survey analysis (14 comments)
  • If variation in effects is so damn important and so damn obvious, why do we hear so little about it? (14 comments)
  • Conway II (14 comments)
  • Webinar on approximate Bayesian computation (14 comments)
  • Noise-mining as standard practice in social science (14 comments)
  • He’s annoyed that PNAS desk-rejected his article. (14 comments)
  • A factor of 40 speed improvement . . . that’s not something that happens every day! (14 comments)
  • Advice for a Young Economist at Heart (14 comments)
  • Unlike MIT, Scientific American does the right thing and flags an inaccurate and irresponsible article that they mistakenly published. Here’s the story: (13 comments)
  • Today in spam (13 comments)
  • Bayesian Workflow (my talk this Wed at Criteo) (13 comments)
  • What can be our goals, and what is too much to hope for, regarding robust statistical procedures? (13 comments)
  • Age-period-cohort analysis. (13 comments)
  • Let’s do preregistered replication studies of the cognitive effects of air pollution—not because we think existing studies are bad, but because we think the topic is important and we want to understand it better. (13 comments)
  • “Lessons from First Online Teaching Experience after COVID-19 Regulations” (13 comments)
  • Monte Carlo and winning the lottery (13 comments)
  • Hamiltonian Monte Carlo using an adjoint-differentiated Laplace approximation: Bayesian inference for latent Gaussian models and beyond (12 comments)
  • The 200-year-old mentor (12 comments)
  • As a forecaster, how important is it to “have a few elections under your belt”? (12 comments)
  • Fiction as a window into other cultures (12 comments)
  • Quino y Mafalda (12 comments)
  • “Pictures represent facts, stories represent acts, and models represent concepts.” (12 comments)
  • Rethinking Rob Kass’ recent talk on science in a less statistics-centric way. (12 comments)
  • Hilarious reply-all loop (12 comments)
  • Hey, this was an unusual media request (12 comments)
  • Be careful when estimating years of life lost: quick-and-dirty estimates of attributable risk are, well, quick and dirty. (12 comments)
  • Best econ story evah (12 comments)
  • John Conway (12 comments)
  • Model building is Lego, not Playmobil. (toward understanding statistical workflow) (12 comments)
  • Bernie electability update (12 comments)
  • As usual, I agree with Paul Meehl: “It is not a reform of significance testing as currently practiced in soft-psych. We are making a more heretical point than any of these: We are attacking the whole tradition of null-hypothesis refutation as a way of appraising theories.” (12 comments)
  • What can we do with complex numbers in Stan? (12 comments)
  • Calling all cats (12 comments)
  • When I was asked, Who do you think is most likely to win the Democratic nomination?, this is how I responded . . . (12 comments)
  • The importance of measurement in psychology (12 comments)
  • How many infectious people are likely to show up at an event? (11 comments)
  • The likelihood principle in model check and model evaluation (11 comments)
  • From the Archives of Psychological Science (11 comments)
  • Sh*ttin brix in the tail… (11 comments)
  • Social science and the replication crisis (my talk this Thurs 8 Oct) (11 comments)
  • The U.S. high school math olympiad champions of the 1970s and 1980s: Where were they then? (11 comments)
  • Getting negative about the critical positivity ratio: when you talk about throwing out the bathwater, really throw out the bathwater! Don’t try to pretend it has some value. Give it up. Let it go. You can do this and still hold on to the baby at the same time! (11 comments)
  • This one quick trick will allow you to become a star forecaster (11 comments)
  • They want open peer review for their paper, and they want it now. Any suggestions? (11 comments)
  • Bayesian analysis of Santa Clara study: Run it yourself in Google Collab, play around with the model, etc! (11 comments)
  • BDA FREE (Bayesian Data Analysis now available online as pdf) (11 comments)
  • “Estimating Covid-19 prevalence using Singapore and Taiwan” (11 comments)
  • Estimates of the severity of COVID-19 disease: another Bayesian model with poststratification (11 comments)
  • The Road Back (11 comments)
  • Don’t talk about hypotheses as being “either confirmed, partially confirmed, or rejected” (11 comments)
  • My review of Ian Stewart’s review of my review of his book (11 comments)
  • “End of novel. Beginning of job.”: That point at which you make the decision to stop thinking and start finishing (10 comments)
  • If— (10 comments)
  • You can figure out the approximate length of our blog lag now. (10 comments)
  • A very short statistical consulting story (10 comments)
  • Further formalization of the “multiverse” idea in statistical modeling (10 comments)
  • Bees have five eyes (10 comments)
  • “Everybody wants to be Jared Diamond” (10 comments)
  • Information, incentives, and goals in election forecasts (10 comments)
  • Taking the bus (10 comments)
  • Why we kept the trig in golf: Mathematical simplicity is not always the same as conceptual simplicity (10 comments)
  • “I just wanted to say that for the first time in three (4!?) years of efforts, I have a way to estimate my model. . . .” (10 comments)
  • More on absolute error vs. relative error in Monte Carlo methods (10 comments)
  • Himmicanes again (10 comments)
  • “Which, in your personal judgment, is worse, if you could only choose ONE? — (a) A homosexual (b) A doctor who refuses to make a house call to someone seriously ill?” (10 comments)
  • This one’s important: Bayesian workflow for disease transmission modeling in Stan (10 comments)
  • New Within-Chain Parallelisation in Stan 2.23: This One’s Easy for Everyone! (10 comments)
  • And the band played on: Low quality studies being published on Covid19 prediction. (10 comments)
  • Why We Sleep—a tale of non-replication. (9 comments)
  • Update on IEEE’s refusal to issue corrections (9 comments)
  • Publishing in Antarctica (9 comments)
  • An odds ratio of 30, which they (sensibly) don’t believe (9 comments)
  • If something is being advertised as “incredible,” it probably is. (9 comments)
  • Bill James is back (9 comments)
  • “100 Stories of Causal Inference”: My talk tomorrow at the Online Causal Inference Seminar (9 comments)
  • On deck through Jan 2021 (9 comments)
  • No, there is no “tension between getting it fast and getting it right” (9 comments)
  • Improving our election poll aggregation model (9 comments)
  • Two good news articles on trends in baseball analytics (9 comments)
  • Faster than ever before: Hamiltonian Monte Carlo using an adjoint-differentiated Laplace approximation (9 comments)
  • Statistical Workflow and the Fractal Nature of Scientific Revolutions (my talk this Wed at the Santa Fe Institute) (9 comments)
  • Blast from the past (9 comments)
  • Standard deviation, standard error, whatever! (9 comments)
  • It’s “a single arena-based heap allocation” . . . whatever that is! (9 comments)
  • Rodman (9 comments)
  • Controversy regarding the effectiveness of Remdesivir (9 comments)
  • Himmicanes! (9 comments)
  • Upholding the patriarchy, one blog post at a time (9 comments)
  • “Non-disclosure is not just an unfortunate, but unfixable, accident. A methodology can be disclosed at any time.” (9 comments)
  • Conditioning on a statistical method as a “meta” version of conditioning on a statistical model (9 comments)
  • The hot hand fallacy fallacy rears its ugly ugly head (9 comments)
  • Rao-Blackwellization and discrete parameters in Stan (9 comments)
  • Are the tabloids better than we give them credit for? (9 comments)
  • Graphs of school shootings in the U.S. (9 comments)
  • How science and science communication really work: coronavirus edition (8 comments)
  • Stop-and-frisk data (8 comments)
  • “Day science” and “Night science” are the same thing—if done right! (8 comments)
  • Coronavirus corrections, data sources, and issues. (8 comments)
  • A Collection of Word Oddities and Trivia (8 comments)
  • Misleading vote reporting (8 comments)
  • Fugitive and cloistered virtues (8 comments)
  • Correctness (8 comments)
  • Progress in the past decade (8 comments)
  • Prediction markets and election forecasts (7 comments)
  • “Model takes many hours to fit and chains don’t converge”: What to do? My advice on first steps. (7 comments)
  • Stacking for Non-mixing Bayesian Computations: The Curse and Blessing of Multimodal Posteriors (7 comments)
  • The point here is not the face masks; it’s the impossibility of assumption-free causal inference when the different treatments are entangled in this way. (7 comments)
  • “Worthwhile content in PNAS” (7 comments)
  • The Fall Guy, by James Lasdun (7 comments)
  • Structural equation modeling and Stan (7 comments)
  • “Sometimes research just has to start somewhere, and subject itself to criticism and potential improvement.” (7 comments)
  • “It just happens to be in the nature of knowledge that it cannot be conserved if it does not grow.” (7 comments)
  • Pocket Kings by Ted Heller (7 comments)
  • On deck for the first half of 2020 (7 comments)
  • Red Team prepublication review update (6 comments)
  • What George Michael’s song Freedom! was really about (6 comments)
  • Mister P for the 2020 presidential election in Belarus (6 comments)
  • Lying with statistics (6 comments)
  • Public health researchers explain: “Death by despair” is a thing, but not the biggest thing (6 comments)
  • Interactive analysis needs theories of inference (6 comments)
  • Stan receives its second Nobel prize. (6 comments)
  • Misrepresenting data from a published source . . . it happens all the time! (6 comments)
  • Post-stratified longitudinal item response model for trust in state institutions in Europe (6 comments)
  • Aki’s talk about reference models in model selection in Laplace’s demon series (6 comments)
  • Get your research project reviewed by The Red Team: this seems like a good idea! (6 comments)
  • “Then the flaming sheet, with the whirr of a liberated phoenix, would fly up the chimney to join the stars.” (6 comments)
  • More coronavirus research: Using Stan to fit differential equation models in epidemiology (6 comments)
  • OHDSI COVID-19 study-a-thon. (6 comments)
  • 100 Things to Know, from Lane Kenworthy (6 comments)
  • Nonparametric Bayes webinar (5 comments)
  • Piranhas in the rain: Why instrumental variables are not as clean as you might have thought (5 comments)
  • Election Scenario Explorer using Economist Election Model (5 comments)
  • Election forecasts: The math, the goals, and the incentives (my talk this Friday afternoon at Cornell University) (5 comments)
  • Parallel in Stan (5 comments)
  • My talk this Wed 7:30pm (NY time) / Thurs 9:30am (Australian time) at the Victorian Centre for Biostatistics (5 comments)
  • Usual channels of clinical research dissemination getting somewhat clogged: What can go wrong – does. (5 comments)
  • Corona virus presentation by the Dutch CDC, also some thoughts on the audience for these sorts of presentations (5 comments)
  • Prior predictive, posterior predictive, and cross-validation as graphical models (5 comments)
  • WE HAVE A VERY IMPORTANT ANNOUNCEMENT . . . (5 comments)
  • Smoothness, or lack thereof, in MRP estimates over time (5 comments)
  • Le Detection Club (4 comments)
  • Authors repeat same error in 2019 that they acknowledged and admitted was wrong in 2015 (4 comments)
  • “Valid t-ratio Inference for instrumental variables” (4 comments)
  • Uri Simonsohn’s Small Telescopes (4 comments)
  • Regression and Other Stories translated into Python! (4 comments)
  • StanCon 2020 program is now online! (4 comments)
  • Adjusting for Type M error (4 comments)
  • Inference for coronavirus prevalence by inverting hypothesis tests (4 comments)
  • Embracing Variation and Accepting Uncertainty (my talk this Wed/Tues at a symposium on communicating uncertainty) (4 comments)
  • Validating Bayesian model comparison using fake data (4 comments)
  • “Young Lions: How Jewish Authors Reinvented the American War Novel” (4 comments)
  • My talk Wednesday at the Columbia coronavirus seminar (4 comments)
  • Online Causal Inference Seminar starts next Tues! (4 comments)
  • The Great Society, Reagan’s revolution, and generations of presidential voting (4 comments)
  • Making differential equation models in Stan more computationally efficient via some analytic integration (4 comments)
  • Will decentralised collaboration increase the robustness of scientific findings in biomedical research? Some data and some causal questions. (4 comments)
  • Merlin and me talk on the Bayesian podcast about forecasting the election (3 comments)
  • “Election Forecasting: How We Succeeded Brilliantly, Failed Miserably, or Landed Somewhere in Between” (3 comments)
  • Stan’s Within-Chain Parallelization now available with brms (3 comments)
  • Recently in the sister blog (3 comments)
  • Nooooooooooooo! (3 comments)
  • “Laplace’s Demon: A Seminar Series about Bayesian Machine Learning at Scale” and my answers to their questions (3 comments)
  • StanCon 2020. A 24h Global Event. (More details, new talk deadline: July 1) (3 comments)
  • Making fun of Ted talks (3 comments)
  • “Partially Identified Stan Model of COVID-19 Spread” (3 comments)
  • “A Path Forward for Stan,” from Sean Talts, former director of Stan’s Technical Working Group (3 comments)
  • Recent unpublished papers (3 comments)
  • What up with red state blue state? (3 comments)
  • American Causal Inference May 2020 Austin Texas (3 comments)
  • You don’t need a retina specialist to know which way the wind blows (2 comments)
  • Hiring at all levels at Flatiron Institute’s Center for Computational Mathematics (2 comments)
  • “Statistical Models of Election Outcomes”: My talk this evening at the University of Michigan (2 comments)
  • Korean translation of BDA3! (2 comments)
  • Laplace’s Theories of Cognitive Illusions, Heuristics and Biases (2 comments)
  • “Note sure what the lesson for data analysis quality control is here is here, but interesting to wonder about how that mistake was not caught pre-publication.” (2 comments)
  • covidcp.org: A COVID-19 collaboration platform. (2 comments)
  • How to embrace variation and accept uncertainty in linguistic and psycholinguistic data analysis (2 comments)
  • MRP Conference registration now open! (2 comments)
  • MRP Conference at Columbia April 3rd – April 4th 2020 (2 comments)
  • Some Westlake quotes (2 comments)
  • Covid crowdsourcing (1 comment)
  • Best comics of 2010-2019? (1 comment)
  • My scheduled talks this week (1 comment)
  • We are stat professors with the American Statistical Association, and we’re thrilled to talk to you about the statistics behind voting. Ask us anything! (1 comment)
  • They’re looking for Stan and R programmers, and they’re willing to pay. (1 comment)
  • Postdoc in Bayesian spatiotemporal modeling at Imperial College London! (1 comment)
  • Cmdstan 2.24.1 is released! (1 comment)
  • Some possibly different experiences of being a statistician working with an international collaborative research group like OHDSI. (1 comment)
  • This is your chance to comment on the U.S. government’s review of evidence on the effectiveness of home visiting. Comments are due by 1 Sept. (1 comment)
  • StanCon 2020 registration is live! (1 comment)
  • Update on OHDSI Covid19 Activities. (1 comment)
  • COVID19 Global Forecasting Kaggle (1 comment)
  • Sponsor a Stan postdoc or programmer! (1 comment)
  • Deep learning workflow (1 comment)
  • A normalizing flow by any other name (1 comment)
  • Summer training in statistical sampling at University of Michigan (1 comment)
  • StanCon 2020: August 11-14. Registration now open! (1 comment)
  • Exciting postdoc opening in spatial statistics at Michigan: Coccidioides is coming, and only you can stop it! (1 comment)
  • The Generalizer (1 comment)
  • To all the reviewers we’ve loved before (0 comments)
  • Postdoc at the Polarization and Social Change Lab (0 comments)
  • 2 PhD student positions on Bayesian workflow! With Paul Bürkner! (0 comments)
  • Postdoc in Ann Arbor to work with clinical and cohort studies! (0 comments)
  • Birthday data! (0 comments)
  • epidemia: An R package for Bayesian epidemiological modeling (0 comments)
  • The EpiBayes research group at the University of Michigan has a postdoc opening! (0 comments)
  • StanCon 2020 is on Thursday! (0 comments)
  • Jobzzzzzz! (0 comments)
  • Job opportunity: statistician for carbon credits in agriculture (0 comments)
  • Children’s Variety Seeking in Food Choices (0 comments)
  • Sequential Bayesian Designs for Rapid Learning in COVID-19 Clinical Trials (0 comments)
  • Laplace’s Demon: A Seminar Series about Bayesian Machine Learning at Scale (0 comments)
  • MRP with R and Stan; MRP with Python and Tensorflow (0 comments)
  • Coming in 6 months or so (0 comments)
  • Update: OHDSI COVID-19 study-a-thon. (0 comments)
  • Another Bayesian model of coronavirus progression (0 comments)
  • “Are Relational Inferences from Crowdsourced and Opt-in Samples Generalizable? Comparing Criminal Justice Attitudes in the GSS and Five Online Samples” (0 comments)
  • Rank-normalization, folding, and localization: An improved R-hat for assessing convergence of MCMC (0 comments)
  • The 100-day writing challenge (0 comments)
  • “It’s not just that the emperor has no clothes, it’s more like the emperor has been standing in the public square for fifteen years screaming, I’m naked! I’m naked! Look at me! And the scientific establishment is like, Wow, what a beautiful outfit.” (0 comments)
  • Call for proposals for a State Department project on estimating the prevalence of human trafficking (0 comments)
  • Hey—the New York Times is hiring an election forecaster! (0 comments)

27 thoughts on “Most controversial posts of 2020”

  1. I don’t see the use of either the histogram or the scatter plot. Why would comments vary by date and who would care if they did?

    Can’t we do something meaningful with this data? How do Andrew’s blog comment totals affect traffic fatalities? The sex ratio of babies born on a given day? Do they correlate with the number of Trump tweets? How does Andrew’s content impact coronary incidents for conservatives / progressives? Bond buying by the Fed? The time of day of peak windspeed for Atlantic Basin hurricanes?

    Such a rich data source, so many things to learn!

    • Jim:

      I thought it would be fun to list the posts by decreasing order of number of comments. By the way, there’s some good stuff in the posts with very few comments: some posts just supply information and people don’t necessarily react to them online. Anyway, once I had the numbers I thought I might as well plot them. The purpose of the time series plot is not to show trends; it was just to see when during the year the most-discussed posts happen to appear. In this case there was no clear pattern—but that often happens with graphs! I didn’t know what it would look like ahead of time, and I wanted to avoid the selection bias that arises from showing cool graphs but not showing boring graphs.

      • Goodness, sorry I just can’t stop thinking about the possibilities in this data!

        Regression discontinuity analysis! Before/After: COVID, 4th of July, Dem/Rep conventions, election….

        • My comment is my original work. I stand by it. Even if all the data turns out to be wrong or manufactured, it doesn’t affect the conclusions: more male babies are born 9 months after 52-week highs in the NSDQ.

    • Scatterplot is interesting because a decreasing number of comments over time would indicate that perhaps the most recent comments hadn’t reached their potential. An increasing number of comments would show a growing engagement. In this case, we see neither. So, comments probably “max out” after a day or two. And Andrew’s audience is holding steady.

    • I can vouch for Andrew being just one person, and a very productive one…but he is not the only one who writes posts on this blog! As far as number of comments goes, my posts were at positions 3, 7, 9, 11, among others. I don’t post very often, but my posts tend to generate a lot of comments. That said, there is some inflation because I post more follow-up comments on my posts than Andrew does on his; if self-comments were excluded, mine might all drop several places.

      Still, there’s no question that Andrew contributes the vast majority of blog posts (and insights).

      • Phil –

        Sure. I actually was factoring in the (excellent) contributions from the other posters also. Yourself included, and not the least the cats.

        That said, I’m still gobsmacked by Andrew’s productivity – particularly given that he clearly does much in addition to just what shows up directly on the blog.

        • Oh sure, you’re not wrong there. I’ve known him since junior high school and he has pretty much always been like this. Although there was about a year when he wasted a ton of time playing online chess. Didn’t seem to hurt his blog production or publication record, though, so I guess it came out of reading time or something.

  2. I suspect measurement error and demand a recount. Seriously, I posted two comments responding to two other comments, and they all disappeared. I did see someone walking away with my comments when the observers were banished….

  3. Presumably the goal is to have posts that produce zero comments. Such posts will have addressed the issue so thoroughly and effectively that there’s nothing left to say.

  4. But that was data for just one year, and a strange one at that. What about previous years and some sort of time series followed by a forecast for 2021?

  5. Wouldn’t something like log(1+num_comments) be a more interesting plot? I feel like there’s more difference between a 1 comment post and a 10 comment post than between a 200 comment post and a 600 comment post.
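
    A minimal sketch of the transform suggested here, assuming the comment counts are available as a plain list of integers. The function name and the sample values are illustrative only; they are not the full blog data.

```python
import math

def log_comment_scale(num_comments):
    """Return log(1 + n) for each comment count, so a count of 0 maps to 0."""
    return [math.log1p(n) for n in num_comments]

# Illustrative counts taken from the comment above, plus a zero-comment post.
counts = [0, 1, 10, 200, 600]
scaled = log_comment_scale(counts)

# On the log scale, the step from 1 to 10 comments is larger
# than the step from 200 to 600 comments.
gap_low = scaled[2] - scaled[1]
gap_high = scaled[4] - scaled[3]
print(f"1 -> 10: {gap_low:.2f}   200 -> 600: {gap_high:.2f}")
```

    Adding 1 before taking the log keeps the zero-comment posts on the plot, which a plain log would drop.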

  6. Unsurprisingly, like just about every other form of online participation, the number of comments per post is a long tail distribution. For anyone curious about why these happen, here’s some research I’ve done on the topic.

    Steven L. Johnson, Samer Faraj and Srinivas Kudaravalli (2014). “Emergence of Power Laws in Online Communities: The Role of Social Mechanisms and Preferential Attachment”, MIS Quarterly, 38(3), pp. 795-808.

    Abstract: Online communities bring together individuals with shared interest in joint action or sustained interaction. Power law distributions of user popularity appear ubiquitous in online communities but their formation mechanisms are not well understood. This study tests for the formation of power law distributions via the mechanisms of preferential attachment, least efforts, direct reciprocity and indirect reciprocity. Preferential attachment, where new entrants favor connections with already popular participants, is the predominant explanation suggested by prior literature. Yet, the attribution of preferential attachment or any other mechanism as a single unitary reason for the emergence of power law distributions runs contrary to the social nature of online communities and does not account for diversity of participants’ motivation. Agent based modeling is used to test if a single social mechanism alone or multiple mechanisms together can generate power law distributions observed in online communities. Data from 28 online communities is used to calibrate, validate, and analyze the simulation. Simulated communication networks are randomly generated according to parameters for each hypothesis. Then the fit of the power law distribution in the model testing subset is compared against the fit for these simulated networks. The major finding is that in contrast to research in more general network settings, neither preferential attachment nor any other single mechanism alone generates a power law distribution. Instead, a blended model of preferential attachment with other social network formation mechanisms was most consistent with power law distributions seen in online communities. This suggests the need to move away from stylized explanations of network emergence that rely on single theories toward more highly socialized and multi-theoretic explanations of community development.
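
    For a picture of the preferential attachment mechanism named in the abstract, here is a toy sketch. It is not the paper’s agent-based model. It assumes a hypothetical rule in which each new comment lands on a post with probability proportional to that post’s current comment count plus one, and the function name and parameters are made up for illustration. The abstract reports that this mechanism alone does not reproduce the power laws seen in real communities, so this shows only how rich-get-richer dynamics skew the counts.

```python
import random

def simulate_preferential_attachment(n_posts, n_comments, seed=0):
    """Assign comments to posts one at a time, favoring already-popular posts.

    Hypothetical rule: a post's weight is (current comments + 1), so a post
    with no comments can still be chosen.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    counts = [0] * n_posts
    for _ in range(n_comments):
        weights = [c + 1 for c in counts]
        post = rng.choices(range(n_posts), weights=weights, k=1)[0]
        counts[post] += 1
    return counts

counts = simulate_preferential_attachment(n_posts=50, n_comments=500)
ranked = sorted(counts, reverse=True)
print("most-commented posts:", ranked[:5])
print("least-commented posts:", ranked[-5:])
```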

  7. Andrew – let’s do an information quality assessment of this.

    Goal: comments indicate traction. The goal could be to raise issues that get high traction.
    X: the data. You present here the blog titles and explain how you got a time stamp.
    f: the analysis: a histogram, or a plot of number of comments in time order
    U: ???

    Data resolution: except for counting comments, nothing was done to look at the content of the comments. As comments tend to generate other comments, you do not seem to have the right data resolution to generate information on what creates traction.

    Data structure: the context for many of the blogs is completely ignored. Some additional data on publications or events related to the blogs could prove powerful.

    Data integration: here there was none.

    Temporal relevance: as mentioned. The before and after COVID perspective could prove interesting.

    Chronology of data and goal: the elicitation of comments is time dependent. This aspect was not looked at.

    Generalizability: what can be learned from this as relating to other contexts – no effort was shown in this direction.

    Operationalizability: no idea what to do with this.

    Communication of findings: so-so.

    Overall, poor information quality, at least in terms of the goal stated above.

    • Ron:

      I was working with what I had, which was the list of posts, dates, and number of comments. You can feel free to scrape the blog off the web and do whatever analysis you’d like!

      • I was actually thinking of giving this as one of the optional projects in my forthcoming course on data analytics. I am also working now on a book titled modern stats with Python and this could make a nice data set for the section on text analytics. In any case, tx for posting this.
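For anyone taking up that suggestion, here is a minimal sketch of a starting point that works only from the list in the post. It assumes each entry has the form `Title (N comments)`. The helper names `parse_posts`, `bin_label`, and `histogram` are hypothetical, and the exact bin edges (1-5, 6-10, and so on, plus a separate bar for zero) are an assumption about how the width-5 bars were drawn.

```python
import re
from collections import Counter

# A few entries copied from the list in the post; the full list would be
# pasted or scraped in the same "Title (N comments)" form.
SAMPLE = """\
“So the real scandal is: Why did anyone ever listen to this guy?” (636 comments)
Concerns with that Stanford study of coronavirus prevalence (431 comments)
Coronavirus in Sweden, what’s the story? (315 comments)
(Some) forecasting for COVID-19 has failed: a discussion of Taleb and Ioannidis et al. (224 comments)
Coronavirus Quickies (194 comments)
"""

# Greedy title match, so parentheses inside a title are kept in the title
# and only the trailing "(N comments)" is read as the count.
LINE_RE = re.compile(r"^(?P<title>.*)\((?P<n>\d+) comments?\)\s*$")


def parse_posts(text):
    """Return a list of (title, comment_count) pairs."""
    posts = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            posts.append((match.group("title").strip(), int(match.group("n"))))
    return posts


def bin_label(n, width=5):
    """Label for the bar a count falls in; zero gets its own bar."""
    if n == 0:
        return "0"
    low = ((n - 1) // width) * width + 1
    return f"{low}-{low + width - 1}"


def histogram(posts, width=5):
    """Count posts per bar."""
    return Counter(bin_label(n, width) for _, n in posts)


if __name__ == "__main__":
    posts = parse_posts(SAMPLE)
    for label, count in histogram(posts).items():
        print(label, count)
```

Looking at the content or timing of the comments, as the assessment above asks for, would need the comment text and time stamps, which this list does not contain.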

  8. I ran into this about 8 days after it was generated. I think it is interesting, but I’d like it to have more info and links.
    1) I’d like to see the date of the post so that I could see where in your second plot it occurred, thus which of these posts contributed to a peak in the Peak/Date plot.
    2) I’d like each post in your list to also link to the post. I would be interested both in seeing the data of the posts that were on the list and in getting an idea of the “tone” of the comments; it would help if I could click on the item in your list and go directly to the post.
