Agreed that what I wrote may not be *generally* true either (need to think about that). But maybe you want to qualify what you wrote in your blog post, Andrew, because otherwise people will cite your claim as fact (“Gelman said so”).

As for choice of parameterization, I always use the sum contrast parameterization, as it centers the predictor (Gelman and Hill 2007). Treatment contrasts are almost always irrelevant for my research questions anyway. See: https://www.sciencedirect.com/science/article/pii/S0749596X19300695?via%3Dihub

Zee et al. present a method for checking the plausibility that the standard deviation of an effect could be better modeled by including one or more variables measured but not initially considered theoretically relevant. This seems a lot like the practice of checking the empirical ICC between clusters before deciding whether to include it in the model (e.g., we may model an effect with between-classrooms ICC but leave out between-schools ICC), which is consistent with best practices. It also seems a lot like stepwise regression, which I think has gotten a bad rap due to our pursuit of a parsimonious, “true” model and our fear of overfitting, as well as abuses like failing to report the steps and reliance on p-values to decide what stays in and what gets booted. In other words, their method should be standard practice for descriptive regression analysis and reporting, particularly when the theory is vague and the predictors are many.

My one criticism of their work lies in their call to “distinguish true causal effect heterogeneity from spurious sources operating at the subject level such as sampling error or measurement error.” The term “spurious” here implies that these forms of error actually are random or ignorable, unlike heterogeneity due to individual differences among study participants. This is only true of sampling error when participant inclusion in the sample has been determined by a truly random selection process, and it may never be true of measurement error. Much of what we call measurement error can be explained by differential item functioning (DIF), the unintended interaction of participant characteristics (other than the target construct) with measure characteristics. Even what the authors call “treatment error” (which I take to mean treatment infidelity) generally covaries with participant characteristics like engagement and experience. Our use of a random error term to represent these sources of variability is just as much of a useful fiction, and potentially just as much of a missed opportunity, as conceptualizing individual heterogeneity as truly random. As a practical matter, it’s fine (and ultimately necessary) to draw the line at modeling individual differences but not DIF, or DIF but not infidelity, so long as we recognize the arbitrariness of the choice. But wherever we draw that line, it is still a useful fiction dictated by our design and measures. To coin a phrase, it’s model assumptions all the way down!

Indeed, as discussed here, it all depends on the relative sizes of the interactions and the main effects. To say “an interaction is just a main effect seen differently” isn’t generally correct, because the choice of parameterization itself contains information. Main effects are “main effects” for a reason; this is how the models get set up. Again, it depends on the example.

It looked to me like at least in the simplest simulations, either you generated a particle of the dominant type, in which case the count of that type increased, or you generated a particle of the non-dominant type, in which case it annihilated another particle and the dominant type count decreased.

Yes, but this is not a consequence of that:

net particle generation rate declines with time

One species is dominant most of the time, but dominance will switch back and forth between them if you wait long enough. So that is an additional assumption you are making, then.

It looked to me like at least in the simplest simulations, either you generated a particle of the dominant type, in which case the count of that type increased, or you generated a particle of the non-dominant type, in which case it annihilated another particle and the dominant type count decreased.

If you think of + as matter and – as antimatter… basically you either added +1 or -1 to the “total count”.

The sum of a large number of IID random variables is approximately a normal random variable, so if you skip a few hundred of your timesteps you can already guess that the result of that larger timestep will be a normal random increment.

If we aggregate a few hundred timesteps of your particle-by-particle simulation, we can simulate the net effect by just drawing normal random numbers… so you can think of my simulation as automatically covering several hundred of your timesteps per each of mine…
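A minimal sketch of that aggregation argument, in Python for convenience (the function name and the choice of 400 micro-steps are just for illustration):

```python
import random
import statistics

random.seed(1)

def net_count_after(steps):
    # One micro-step adds +1 (matter) or -1 (antimatter) with equal probability.
    return sum(random.choice((-1, 1)) for _ in range(steps))

# Aggregate 400 micro-steps into one macro-increment, many times over.
macro_increments = [net_count_after(400) for _ in range(5000)]

# By the central limit theorem, the macro-increment is approximately
# normal with mean 0 and standard deviation sqrt(400) = 20.
print(statistics.mean(macro_increments))
print(statistics.stdev(macro_increments))
```

So a single normal draw with sd 20 stands in for 400 of the particle-by-particle steps.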

As far as I can tell, this is not generally true:

https://vasishth-statistics.blogspot.com/2018/04/the-relationship-between-paired-t-tests.html

In this case, an interaction is just a main effect seen differently, so it is nothing special; unless I made a mistake somewhere. I think Jake Westfall might have mentioned this at some point too.

the observations were averages.

Actually, averages of averages. And I’ve even seen an additional layer of averages.

Say you measure a number of cells from the same animal at one timepoint. Take the average. Do the same for multiple timepoints in the same animal, then average all those averages. Do the same thing for multiple animals in the same “group”, and get the average of all those averaged averages.
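The distinction matters when group sizes differ: an average of averages is not the same as the pooled average over all the raw measurements. A tiny made-up example (all numbers hypothetical):

```python
# Hypothetical cell measurements at two timepoints in one animal,
# with unequal numbers of cells per timepoint.
timepoint_a = [10, 10, 10, 10]  # 4 cells
timepoint_b = [20, 20]          # 2 cells

# Average of averages, as described above:
avg_of_avgs = (sum(timepoint_a) / len(timepoint_a)
               + sum(timepoint_b) / len(timepoint_b)) / 2

# Pooled average over all individual cells:
all_cells = timepoint_a + timepoint_b
pooled = sum(all_cells) / len(all_cells)

print(avg_of_avgs)  # 15.0
print(pooled)       # 13.333...
```

Each layer of averaging also throws away the spread of the underlying points, which is the information the dynamite-plot anecdote below is about.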

so you can simplify your model as

I don’t really see how you derived this from the simulations but ok. I think you are making the same argument I did that if the universe became less dense over time a certain favored species of particle would get “frozen in” as the majority. In another comment I speculated:

If “black hole baryogenesis” was most common soon after the big bang then I would expect the results to be uniform today. Basically I am imagining that the rate of baryogenesis is much lower today than in the past because there are fewer black holes (because the universe has become less dense).

However, you do not require that assumption to get the same outcome. Even with constant rate of baryogenesis, the simulations show you still get a dominant species.

Pretty much any way I look at it I do not expect equal amounts of matter/anti-matter, despite them being symmetrically generated. It just doesn’t seem to be any great mystery when you look at individual outcomes instead of the average.

To bring it more on topic, I think this confusion between average and individual instances is going to be looked back on as a very common error in our day. Spanning pretty much all fields of research.

I remember looking at a dynamite plot (mean +/- sd) from a colleague of something like Western blot results (some kind of bio assay) and asking why the sd was so much larger in one group vs the other. Was there a high outlier, a low outlier, or what? They had no idea what the underlying data even looked like.

All their “models” (in the loose sense used by biomed researchers) were about what happened at the level of an individual cell, but the observations were averages. They never even look at the individual points… unless of course they want to “drop outliers” to “get significance”. While this is not *always* a problem, it should not be assumed to be harmless.

Causal processes in psychology are heterogeneous… meaning we should have models in which individuals undergo various fluctuations in their behavior… an agent-based model, for example…

But if we want to determine whether that model “makes sense” we *can* do things like compare the average of many selected agent predictions to the average of many observed quantities… If our model for example predicts that rates of violence should increase with time, but we show that rates of violence decrease with time… we’ve made a bad assumption about the rules of the agents.

We should not, however, compare one single agent to, say, the average of 100 people’s surveys… That one single agent shows, say, a lot of violent behavior doesn’t invalidate the model. There are plenty of people incarcerated for violence, for example.

A potentially realistic assumption is that the net particle generation rate declines with time. A simple version of this assumption is sigma = sigma0*exp(-t/Tc), where Tc is just some characteristic time for exponential decay…

For more complex models, look at edit 4, where annihilation rates are a function of the number of particles (so more than one particle can be removed each step). Chris suggested this model thinking it would show a tendency toward near-equal matter-antimatter numbers, but it didn’t. Then the argument switched to “not enough black holes”, etc. That is why I say it doesn’t seem he understood that “the probability to be within epsilon of 0 decreases to zero with time.”

And via the Hawking radiation mechanism proposed in that link, I don’t see why the generation rate must decrease with time, but of course it could… In fact, in the simulation, “time” is determined by each particle pair that gets generated (and possibly immediately annihilated).

Fix a small region around 0, like 0 ± 1… since this small region is fixed but the standard deviation grows like sqrt(t), the normal distribution is spreading out, and the density at 0 is decreasing toward 0 because of the normalization constant of the normal curve, which is 1/(sqrt(2*pi)*sigma*sqrt(t))… but the probability to be in this region is approximately p(0)*2, where p(0) is the density of the normal(0, sigma*sqrt(t)) distribution at 0… so it’s decreasing to 0.

This makes sense to me.

library(ggplot2)

sigma = 100*exp(-(1:10000)/1000)

rnd = rnorm(10000)

N = cumsum(rnd*sigma)

qplot(1:10000, N, geom="line")

so you can simplify your model as

N(t + dt) = N(t) + normal(0,sigma0*exp(-t/Tc)*sqrt(dt))

for simplicity let dt=1, let sigma0 = say 100, and Tc=1000, run the model for 10000 steps… my guess is you’ll see a lot of early expansion, and then it basically becomes constant… like a “big bang” ;-)

https://en.wikipedia.org/wiki/Wiener_process

The other link is for the physical process of motion of particles.

The 0 comes about because at time t=0 there are no particles yet, by assumption, and from the Martingale property (that the conditional expected value for the future is the current value). The sigma*sqrt(t) comes from the fact that variances of independent increments add, so the standard deviation of the sum grows like sqrt(t).

So at t=0, what we know about the universe at a large future time t is that it’s a normal(0, sigma*sqrt(t)) random variable.

Fix a small region around 0, like 0 ± 1… since this small region is fixed but the standard deviation grows like sqrt(t), the normal distribution is spreading out, and the density at 0 is decreasing toward 0 because of the normalization constant of the normal curve, which is 1/(sqrt(2*pi)*sigma*sqrt(t))… but the probability to be in this region is approximately p(0)*2, where p(0) is the density of the normal(0, sigma*sqrt(t)) distribution at 0… so it’s decreasing to 0.
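That argument can be checked numerically. A small Python sketch, assuming unit per-step sigma (the function name and epsilon = 1 are just illustrative):

```python
import math

sigma = 1.0  # per-step standard deviation (arbitrary units)

def prob_near_zero(t, eps=1.0):
    # P(|N(t)| <= eps) for N(t) ~ normal(0, sigma*sqrt(t)),
    # approximated by 2*eps times the density at 0:
    # p(0) = 1 / (sqrt(2*pi) * sigma * sqrt(t)).
    return 2 * eps / (math.sqrt(2 * math.pi) * sigma * math.sqrt(t))

# Quadrupling t halves the probability of staying near zero:
print(prob_near_zero(100))
print(prob_near_zero(400))
```

So the chance of finding the net particle count within any fixed window around zero shrinks like 1/sqrt(t).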

https://ccl.northwestern.edu/netlogo/

and modeling a process involving, say, 4 particle types: protons and antiprotons, electrons and positrons… Create some “black holes” randomly located in space; at each time tick, generate a large number of particle pairs at random locations on the grid… give them a very high kinetic energy and conserve momentum (send them in opposite directions), use simple Newtonian gravitation and electrodynamics… it’s not going to be cheap computation, but you could get a sense of how things work… might be fun.

the probability to be within epsilon of 0 decreases to zero with time.

There might also be some kind of feedback mechanism involving changing properties of the black holes… in any case I don’t want to get into it too deep here, since it’s not really on topic, but you can read up on the mathematical concept of Brownian motion starting here: https://en.wikipedia.org/wiki/Brownian_motion

This property of the current value being the expectation for the future is called being a Martingale; the property of the future being determined only by the current value, and not by additional info from the entire past history, is called being Markovian (a Markov process).
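The Martingale property is easy to check by simulation for the ±1 walk discussed above. This Python sketch (the names and the starting value 7 are arbitrary) averages many continuations of a walk that currently sits at 7:

```python
import random

random.seed(2)

def continue_walk(start, steps):
    # Continue a +-1 random walk from `start` for `steps` more ticks.
    x = start
    for _ in range(steps):
        x += random.choice((-1, 1))
    return x

# Conditional on the walk sitting at 7 now, the average over many
# possible futures is (close to) 7, no matter how far ahead we look.
current_value = 7
futures = [continue_walk(current_value, 500) for _ in range(20000)]
print(sum(futures) / len(futures))
```

The spread of those futures still grows like sqrt(steps); only their center stays pinned at the current value.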

The long-term distribution is a Gaussian whose variance increases linearly with time. So the probability to be within epsilon of 0 decreases to zero with time.

This would explain it. Is there a term for this? From my conversation with Chris (who seemed to be somewhat of an expert on random walks), he seemed unaware of such a law.

The long-term distribution is a Gaussian whose variance increases linearly with time, so the probability to be within epsilon of 0 decreases to zero with time. I think Chris in your example was pointing out that this process *by itself* doesn’t produce enough particles fast enough, but obviously this process is only a simple process designed as basically an asymptotic “inner” description (when particles are dense). So what this really requires is sufficient numbers of black holes in the early universe.

Anyway, the point you’re making about comparing process paths to process paths and averages to averages is correct. Averages often look very different from individuals. For example, a Brownian motion path is nowhere differentiable, but the average is a constant.
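That contrast between individual paths and their average is easy to see by simulation. A short sketch (the path counts and lengths are arbitrary):

```python
import random

random.seed(3)

# Simulate 200 random-walk paths of 1000 steps each, all starting at 0.
n_paths, n_steps = 200, 1000
paths = []
for _ in range(n_paths):
    x, path = 0, []
    for _ in range(n_steps):
        x += random.choice((-1, 1))
        path.append(x)
    paths.append(path)

# Each individual path wanders far from zero...
final_spread = max(abs(p[-1]) for p in paths)

# ...but the pointwise average across paths hugs the constant zero.
avg_path = [sum(p[i] for p in paths) / n_paths for i in range(n_steps)]
avg_spread = max(abs(v) for v in avg_path)

print(final_spread, avg_spread)
```

Comparing one jagged path against the flat average (or vice versa) would make the model look wrong when it isn’t.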

This was another issue I found in physics btw. They model an average universe, and compare it to what we observe (one instance of a universe). Thus they have the wrong intuition about what should happen: https://physics.stackexchange.com/questions/505662/why-is-matter-antimatter-asymmetry-surprising-if-asymmetry-can-be-generated-by

I suspect this is a common confusion because it won’t get caught by dimensional analysis. As I say in one of the comments there:

As I have said a couple times now: We do not live in the average of many universes, we observe one instance of a universe. So it is incorrect to compare what is predicted on average to our observations. It is unsurprising to me that the average result is not observed in reality. I think it is confusing because the calculations for the average have no logical flaw and have the same units as the individual instances, but they mean different things.

I don’t think that “sometimes it helps” and “sometimes it doesn’t”

Instead, we should look for relatively consistent explanations in terms of things that reliably produce some effects, and allow some model error, but require in aggregate that some statistics of the predictions be rather consistent with statistics of the observed.

It’s still a model of the individual instances, but our decision about whether the model is working is based on comparing the aggregated predicted effects across multiple prediction instances to the aggregated observed outcomes across many observed experiments.

This is, after all, the structure of, for example, the Lattice Boltzmann model of fluid mechanics, and I don’t think anyone is going to claim that it’s not a good theory.
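As a toy illustration of comparing aggregates to aggregates rather than one agent to an average, here is a hypothetical agent model in the spirit of the violence example above (the decay rate, the noise, and all names are invented for the sketch):

```python
import random

random.seed(4)

def simulate_agent(years, base_rate):
    # Hypothetical agent: yearly incident count is noise around a
    # trend that decays 5% per year; individual agents fluctuate a lot.
    return [max(0.0, random.gauss(base_rate * 0.95 ** t, 1.0))
            for t in range(years)]

years, n_agents = 10, 2000
agents = [simulate_agent(years, base_rate=5.0) for _ in range(n_agents)]

# Model check at the aggregate level: the AVERAGE trajectory over many
# simulated agents is what we compare to the average of many observations.
avg_per_year = [sum(a[t] for a in agents) / n_agents for t in range(years)]
print(avg_per_year[0], avg_per_year[-1])
```

A single noisy agent trajectory would be a poor thing to compare against an observed population average, but the aggregate trend across many agents is a fair target for checking.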

“Once you think about it, it’s hard to imagine any nonzero treatment effects that don’t vary.”

Why is the qualifier “nonzero” necessary? Isn’t it just as true to say:

“it’s hard to imagine any treatment effects that don’t vary.”

The average person, rat, cell does not exist so studying that could be very misleading: https://statmodeling.stat.columbia.edu/2015/10/05/cognitive-skills-rising-and-falling/#comment-245800

My only real concern is summed up in this paragraph from the intro: “Because our primary concern is with theory formulation and testing, we advocate models that are adequate for this task, that are readily available, and that are easy to use. Some causal processes in experimental work will no doubt require the more sophisticated tools that are now becoming available, but in our view, the bulk of the benefits can be obtained using simpler approaches.”

I certainly appreciate why they focus on accessibility in order to communicate their message more broadly, but I think a focus on “readily available” and “easy to use” models is exactly why some parts of psychology are so theoretically barren (this is the story of ANOVA after all, which they even recount in the paper). As unpleasant as it may be, I think the best way forward is to accept that more “sophisticated tools” (by which I mean more complex models) are what you really need to formulate and advance theories that have any substantial value.
