Matthew Poes writes: I have a question that I think you have answered for me before. There is an argument to be made that HLM should not be performed if a sample is too small (too small level 2 and too small level 1 units). Lots of papers written with guidelines on what those should […]

**Bayesian Statistics** category.

## Of multiple comparisons and multilevel models

Kleber Neves writes: I’ve been a long-time reader of your blog, eventually becoming more involved with the “replication crisis” and such (currently, I work with the Brazilian Reproducibility Initiative). Anyway, as I’m now going deeper into statistics, I feel like I still lack some foundational intuitions (I was trained as a half computer scientist/half experimental […]

## Transforming parameters in a simple time-series model; debugging the Jacobian

So. This one is pretty simple. But the general idea could be useful to some of you. So here goes. We were fitting a model with an autocorrelation parameter, rho, which was constrained to be between 0 and 1. The model looks like this: eta_t ~ normal(rho*eta_{t-1}, sigma_res), for t = 2, 3, … T […]
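For readers who want something concrete to poke at, here is a minimal sketch of the model in the excerpt, plus one way to sanity-check a Jacobian. The excerpt does not say which transformation was used, so the logit transform below (rho = inv_logit(u)) is an assumption for illustration only. The starting draw for eta_1 and all the function names are hypothetical too. The check compares the analytic log-Jacobian with a finite-difference estimate.

```python
import math
import random


def simulate_ar1(rho, sigma_res, T, seed=0):
    """Simulate eta_t ~ normal(rho * eta_{t-1}, sigma_res) for t = 2..T.

    The distribution of eta_1 is not given in the excerpt; drawing it from
    normal(0, sigma_res) is an assumption made only so the sketch runs.
    """
    rng = random.Random(seed)
    eta = [rng.gauss(0.0, sigma_res)]
    for t in range(1, T):
        eta.append(rng.gauss(rho * eta[t - 1], sigma_res))
    return eta


def inv_logit(u):
    """Map an unconstrained u to rho in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-u))


def log_jacobian(u):
    """Analytic log |d rho / d u| for rho = inv_logit(u): log(rho) + log(1 - rho)."""
    rho = inv_logit(u)
    return math.log(rho) + math.log1p(-rho)


def numeric_log_jacobian(u, h=1e-6):
    """Central finite-difference estimate of log |d rho / d u|, for debugging."""
    derivative = (inv_logit(u + h) - inv_logit(u - h)) / (2.0 * h)
    return math.log(abs(derivative))


if __name__ == "__main__":
    draws = simulate_ar1(rho=0.7, sigma_res=1.0, T=5)
    print(len(draws))
    for u in (-2.0, 0.0, 1.5):
        print(u, log_jacobian(u), numeric_log_jacobian(u))
```

If the two log-Jacobian columns disagree beyond numerical noise, the analytic term is the first place to look.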

## Data partitioning as an essential element in evaluation of predictive properties of a statistical method

In a discussion of our stacking paper, the point came up that LOO (leave-one-out cross validation) requires a partitioning of data—you can only “leave one out” if you define what “one” is. It is sometimes said that LOO “relies on the data-exchangeability assumption,” but I don’t think that’s quite the right way to put it, […]
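To make the point about partitioning concrete, here is a small sketch of two different definitions of "one" applied to the same data: leave out one observation, or leave out one group. The function names and the toy group labels are hypothetical and are not taken from the stacking paper.

```python
from collections import defaultdict


def leave_one_observation_out(n):
    """Yield (train_indices, held_out_indices), where "one" is a single row."""
    for i in range(n):
        yield [j for j in range(n) if j != i], [i]


def leave_one_group_out(groups):
    """Yield (train_indices, held_out_indices), where "one" is a whole group."""
    members = defaultdict(list)
    for idx, g in enumerate(groups):
        members[g].append(idx)
    for g, held_out in members.items():
        train = [i for i in range(len(groups)) if groups[i] != g]
        yield train, held_out


if __name__ == "__main__":
    # Toy labels: six observations nested in three groups (illustrative only).
    groups = ["a", "a", "b", "c", "c", "c"]
    print(len(list(leave_one_observation_out(len(groups)))))  # six folds
    print(len(list(leave_one_group_out(groups))))             # three folds
```

The same six data points give six folds under one partition and three under the other, so the two cross-validation schemes evaluate different predictive tasks.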

## “The Book of Why” by Pearl and Mackenzie

Judea Pearl and Dana Mackenzie sent me a copy of their new book, “The book of why: The new science of cause and effect.” There are some things I don’t like about their book, and I’ll get to that, but I want to start with a central point of theirs with which I agree strongly. […]

## Did she really live 122 years?

Even more famous than “the Japanese dude who won the hot dog eating contest” is “the French lady who lived to be 122 years old.” But did she really? Paul Campos points us to this post, where he writes: Here’s a statistical series, laying out various points along the 100 longest known durations of a […]

## Objective Bayes conference in June

Christian Robert points us to this Objective Bayes Methodology Conference in Warwick, England in June. I’m not a big fan of the term “objective Bayes” (see my paper with Christian Hennig, Beyond subjective and objective in statistics), but the conference itself looks interesting, and there are still a few weeks left for people to submit […]

## “Principles of posterior visualization”

What better way to start the new year than with a discussion of statistical graphics. Mikhail Shubin has this great post from a few years ago on Bayesian visualization. He lists the following principles: Principle 1: Uncertainty should be visualized Principle 2: Visualization of variability ≠ Visualization of uncertainty Principle 3: Equal probability = Equal […]

## “Check yourself before you wreck yourself: Assessing discrete choice models through predictive simulations”

Timothy Brathwaite sends along this wonderfully-titled article (also here, and here’s the replication code), which begins: Typically, discrete choice modelers develop ever-more advanced models and estimation methods. Compared to the impressive progress in model development and estimation, model-checking techniques have lagged behind. Often, choice modelers use only crude methods to assess how well an estimated […]

## What is probability?

This came up in a discussion a few years ago, where people were arguing about the meaning of probability: is it long-run frequency, is it subjective belief, is it betting odds, etc? I wrote: Probability is a mathematical concept. I think Martha Smith’s analogy to points, lines, and arithmetic is a good one. Probabilities are […]

## Exploring model fit by looking at a histogram of a posterior simulation draw of a set of parameters in a hierarchical model

Opher Donchin writes in with a question: We’ve been finding it useful in the lab recently to look at the histogram of samples from the parameter combined across all subjects. We think, but we’re not sure, that this reflects the distribution of that parameter when marginalized across subjects and can be a useful visualization. It […]

## “Do you have any recommendations for useful priors when datasets are small?”

A statistician who works in the pharmaceutical industry writes: I just read your paper (with Dan Simpson and Mike Betancourt) “The Prior Can Often Only Be Understood in the Context of the Likelihood” and I find it refreshing to read that “the practical utility of a prior distribution within a given analysis then depends critically […]

## Prior distributions for covariance matrices

Someone sent me a question regarding the inverse-Wishart prior distribution for a covariance matrix, as it is the default in some software he was using. Inverse-Wishart does not make sense for a prior distribution; it has problems because the shape and scale are tangled. See this paper, “Visualizing Distributions of Covariance Matrices,” by Tomoki Tokuda, Ben Goodrich, […]

## A parable regarding changing standards on the presentation of statistical evidence

Now, the P-value Sneetches Had tables with stars. The Bayesian Sneetches Had none upon thars. Those stars weren’t so big. They were really so small. You might think such a thing wouldn’t matter at all. But, because they had stars, all the P-value Sneetches Would brag, “We’re the best kind of Sneetch on the Beaches. […]

## Bayes, statistics, and reproducibility: “Many serious problems with statistics in practice arise from Bayesian inference that is not Bayesian enough, or frequentist evaluation that is not frequentist enough, in both cases using replication distributions that do not make scientific sense or do not reflect the actual procedures being performed on the data.”

This is an abstract I wrote for a talk I didn’t end up giving. (The conference conflicted with something else I had to do that week.) But I thought it might interest some of you, so here it is: Bayes, statistics, and reproducibility The two central ideas in the foundations of statistics—Bayesian inference and frequentist […]

## “Economic predictions with big data” using partial pooling

Tom Daula points us to this post, “Economic Predictions with Big Data: The Illusion of Sparsity,” by Domenico Giannone, Michele Lenza, and Giorgio Primiceri, and writes: The paper wants to distinguish between variable selection (sparse models) and shrinkage/regularization (dense models) for forecasting with Big Data. “We then conduct Bayesian inference on these two crucial parameters—model […]

## 2018: How did people actually vote? (The real story, not the exit polls.)

Following up on the post that we linked to last week, here’s Yair’s analysis, using Mister P, of how everyone voted. Like Yair, I think these results are much better than what you’ll see from exit polls, partly because the analysis is more sophisticated (MRP gives you state-by-state estimates in each demographic group), partly because […]

## Watch out for naively (because implicitly based on flat-prior) Bayesian statements based on classical confidence intervals! (Comptroller of the Currency edition)

Laurent Belsie writes: An economist formerly with the Consumer Financial Protection Bureau wrote a paper on whether a move away from forced arbitration would cost credit card companies money. He found that the results are statistically insignificant at the 95 percent (and 90 percent) confidence level. But the Office of the Comptroller of the Currency […]

## My two talks in Austria next week, on two of your favorite topics!

Innsbruck, 7 Nov 2018: The study of American politics as a window into understanding uncertainty in science We begin by discussing recent American elections in the context of political polarization, and we consider similarities and differences with European politics. We then discuss statistical challenges in the measurement of public opinion: inference from opinion polls with […]

## What does it mean to talk about a “1 in 600 year drought”?

Patrick Atwater writes: Curious to your thoughts on a bit of a statistical and philosophical quandary. We often make statements like this drought was a 1 in 400 year event but what do we really mean when we say that? In California for example there was an oft repeated line that the recent historic drought was […]
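One conventional reading of "1 in N year event" is an annual exceedance probability of 1/N. The sketch below works through the arithmetic under that reading. It assumes the annual probability is constant and that years are independent, which is exactly the kind of assumption the question puts in doubt. The function name is hypothetical.

```python
def prob_at_least_one(return_period_years, horizon_years):
    """Probability of at least one such event within the horizon.

    Assumes a constant annual probability of 1 / return_period_years and
    independence across years (both are assumptions, not facts about droughts).
    """
    p_annual = 1.0 / return_period_years
    return 1.0 - (1.0 - p_annual) ** horizon_years


if __name__ == "__main__":
    for horizon in (1, 30, 100, 400):
        print(horizon, round(prob_at_least_one(400, horizon), 4))
```

Under these assumptions a "1 in 400 year" event is more likely than not to occur at least once in any given 400-year span, which is one reason the phrase is easy to misread.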