An incredibly useful method is to fit a statistical model repeatedly on several different datasets and then display all the estimates together. Examples include running a regression on data from each of the 50 states (see here, as discussed here), or running a regression on data for several years and plotting the estimated coefficients over time.

Here’s another example:

The idea is to fit a separate model for each year, or whatever, and then to look at all these estimates together to see trends. This can be considered as an approximation to multilevel modeling, with the partial pooling done by eye on the graphs rather than using a full statistical model.
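As a rough illustration of the idea, here is a minimal Python sketch that fits the same simple regression separately within each year and collects the slope estimates with their standard errors. The helper name `fit_by_group`, the simulated data, and the drifting true slope are all assumptions made up for this example; they do not come from any of the analyses mentioned here.

```python
import numpy as np

def fit_by_group(x, y, groups):
    """Fit y = a + b*x by least squares separately within each group.

    Returns {group: (slope_estimate, slope_standard_error)}.
    """
    results = {}
    for g in np.unique(groups):
        mask = groups == g
        n = int(mask.sum())
        X = np.column_stack([np.ones(n), x[mask]])  # intercept + predictor
        beta, *_ = np.linalg.lstsq(X, y[mask], rcond=None)
        resid = y[mask] - X @ beta
        sigma2 = resid @ resid / (n - X.shape[1])   # residual variance
        cov = sigma2 * np.linalg.inv(X.T @ X)       # classical OLS covariance
        results[int(g)] = (float(beta[1]), float(np.sqrt(cov[1, 1])))
    return results

# Simulated repeated cross-sections (hypothetical): one survey every four
# years, with a true slope that drifts upward over time.
rng = np.random.default_rng(0)
years = np.repeat(np.arange(1980, 2004, 4), 200)
x = rng.normal(size=years.size)
true_slope = 0.5 + 0.05 * (years - 1980)
y = 1.0 + true_slope * x + rng.normal(size=years.size)

estimates = fit_by_group(x, y, years)
for year, (slope, se) in estimates.items():
    print(f"{year}: slope = {slope:.2f} +/- {se:.2f}")
```

The point is the last step: rather than reading the numbers one at a time, plot the estimates and their uncertainty intervals against year (for instance with matplotlib's `errorbar`) and look at them all at once. No pooling is done in the code; any smoothing across years happens by eye.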

One reason the secret weapon is so great can be seen in various analyses of repeated cross-sectional data, with estimates every two or four years (for example, in studying Congressional or Presidential elections). The horrible alternative often involves people pooling data over decades in order to get stable estimates, but as a result it is then difficult to see time trends, and models get oversimplified.

We call this technique the “secret weapon” because it seems to be done much less often than it could be. I suspect the technique is not used more because people are fixated on point estimates and don’t realize that a graph can tell a clearer story. Another failure of classical statistical estimation!

For some examples of the secret weapon with repeated cross-sectional data, see Figures 2, 4, 9, and 10 of this paper.

Well, I guess it’s not a secret anymore…

That's a very nice set of examples! What should this be called? How about "drift" or "parameter drift"? One can model the parameter drift by going to a higher level.

I had some interesting examples of parameter drift in a medical application: in the second year of the study, they began using the results of the previous year's model, and many parameters changed. A colleague noticed drift in a different study, and it turned out that a different doctor was now recording the symptoms. I agree very much that this tool is totally underutilized.

I agree. And I suspect that this weapon will be even more powerful at informing modelling choices when it comes to spatial data or datasets with even more complicated dependencies. For example, I'm thinking about those IR datasets with "directed dyads" where geographic and temporal contiguity seem to matter in models of war or trade.

I also liked the way the lagged regression fits so nicely into the multilevel modeling (mlm) framework in this paper.