This is a curious statement, because the problem described (N=35 countries vs. N=3 countries in the example) is exactly one of spatial autocorrelation. There are multiple ways to deal with that, and I guess it is theoretically possible that you do not have enough variation between groups to get a precise estimate of whatever quantity you are interested in. Nevertheless, as McElreath noted, the spatial distances can be defined on any variety of differences/similarities, including multidimensional similarity.

I am not sure the time series example is exactly analogous, as what is described there is something of a temporal aggregation problem. There is of course a similar modifiable areal unit problem, but this should probably be thought of as a distinct problem.

]]>This sounds right to me. This isn’t a problem of drawing overly confident conclusions about the association in the sample, due to having a “misleadingly” high N; if all you care about is the association present in the sample, that confidence is warranted. But of course, you don’t just care about the sample — you’re studying it to understand a larger population. The real objection here is just that the sample is not representative of the populations (“all existing countries” and/or “all possible countries”) we really want to learn about from a study like this.

Likewise, with those non-causal associations between US states in different regions — the associations are *real*, if perhaps uninteresting, so it isn’t a problem that we can conclude them with more confidence by considering more states per region. If they’re dominated by inter-region trends, then it would be inadvisable to apply them within a region (there could be a Simpson’s paradox situation). But that’s just a matter of knowing what your numbers mean, and what they don’t mean.

It’s not spam if it’s relevant…

]]>My reaction as well. Anthro and evol bio both obsessed with such problems. For handling distance, most common technique I see is some covariance matrix defined by scaled distances (geographic, linguistic, phylogenetic). The Oceanic islands GP model in Chapter 13 of my textbook is a toy example, where distances are geographic.

]]>+1

]]>This was my impression as well.

]]>I agree that when it comes to doing analyses it can be useful to build up to more complexity. But I think it’s always a good idea to flesh out the theory you think best describes what’s going on before proceeding to analyses. It’s important for perspective and for identifying how various sources of data might help; often there really are more than a few relevant sources of data. Without some kind of broad-perspective model you won’t be capable of making good use of the available data, or of identifying the best data collection effort.

]]>My impression is that he was not using the word “Confederacy” to refer to the specific historical event of the secession and the creation of the CSA, but as a shorthand for a region of the USA (in the same way he used “New England + West Coast” or “flyover country”).

]]>I think it is more informative to think of these sorts of problems as problems of unobserved confounder variables (here ‘culture’) rather than as small N. This encourages us to think about exactly what cultural variables might be important, how they might work and whether we can measure them. Thinking about these theoretical issues is likely to be more useful than focusing on questions of statistical inference.

I suspect Baron might reply to this statement by saying that thinking about unobserved confounders is about causality – and he is just interested in associations. My response is that if we are interested in whether the observed association will extend to some sort of broader population, then these confounders will influence this inference if they are different in the broader population than in the observed sample – for example, when we include East Asia in the sample.

]]>Probably start small when facing a highly complex question like this. Instead of an abstract inequality index of some sort, take a few policies implemented in recent decades and do within-country longitudinal analysis (not exactly a small problem). Then maybe go from there and expand to similar countries with somewhat different policies.

]]>In any case, a zero-sum outcome measure like the ratio of female/male high performers is not nearly as relevant an outcome measure as, say, the percentage of girls/boys that are high performers.

]]>There is also the issue of spatial and temporal scales. When we are binning data spatially or temporally we are free to use very small or large bins. The size of the bins seems related to the amount of uncertainty we have in each observation. For example, if we estimate median household income at the state level then we should have much less uncertainty than estimating the value at the county or zip code level. Similarly, household spending each year is less variable than monthly or daily spending. We can consider a trade-off between lots of data points each with high uncertainty or a few data points each with low uncertainty. I think if we can incorporate this uncertainty in each observation into our modeling process then it might in some sense level the playing field.
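To make that trade-off concrete, here is a minimal sketch (all numbers hypothetical) showing that when each observation carries a known standard error, inverse-variance weighting puts many noisy county-level estimates on the same footing as one precise state-level estimate:

```python
import numpy as np

rng = np.random.default_rng(0)

true_median = 65.0  # hypothetical state-level median income (thousands)

# Option 1: one state-level estimate with low noise
state_se = 0.5
state_est = rng.normal(true_median, state_se)

# Option 2: 100 county-level estimates, each much noisier
n_counties = 100
county_se = 5.0
county_est = rng.normal(true_median, county_se, size=n_counties)

# Precision-weighted (inverse-variance) pooling of the noisy estimates
w = 1.0 / county_se**2
pooled = np.sum(w * county_est) / (n_counties * w)
pooled_se = np.sqrt(1.0 / (n_counties * w))

print(f"state estimate:         {state_est:.2f} +/- {state_se:.2f}")
print(f"pooled county estimate: {pooled:.2f} +/- {pooled_se:.2f}")
```

With 100 counties at SE 5.0, the pooled SE is 5.0/sqrt(100) = 0.5, identical to the single state-level estimate; carrying the per-observation uncertainty through the model is what levels the playing field.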

]]>Gwern:

Yup. A multilevel regression model is still a regression model, and, like any regression model, it can be improved if there is available external information that has not been included in the predictors yet.

]]>Jonathan:

This Michael Porter??

]]>Nice.

]]>Using an MLM strikes me as not being as efficient as possible: we know these similarities decay with distance, whether temporal, physical, or genetic, and we can even construct the phylogenetic trees. Just lumping the countries together in a cluster ignores these differences, which can be large or small, and which provide purchase for regression on the residuals from what would be expected from their autocorrelation. (Iceland should not be considered as similar to Sweden as, say, Finland.) The autocorrelation should be modeled as directly as possible.
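A toy illustration of the difference, using rough capital-to-capital distances (approximate, for illustration only) and a hypothetical lengthscale: a cluster random effect assigns every Nordic pair the same correlation, while a distance-decay kernel lets Sweden–Finland be far more similar than Iceland–Sweden:

```python
import numpy as np

countries = ["Iceland", "Sweden", "Finland"]

# Rough great-circle distances in km between capitals (approximate values)
D = np.array([
    [0.0,    2100.0, 2400.0],   # Iceland
    [2100.0, 0.0,    400.0],    # Sweden
    [2400.0, 400.0,  0.0],      # Finland
])

# A plain cluster random effect treats all three "Nordic" countries as
# exchangeable: identical pairwise correlation.
rho_cluster = 0.5  # hypothetical within-cluster correlation
K_cluster = np.where(np.eye(3, dtype=bool), 1.0, rho_cluster)

# A distance-decay kernel lets similarity fall off continuously.
lengthscale = 1500.0  # km; hypothetical value
K_distance = np.exp(-((D / lengthscale) ** 2))

print("cluster correlations:\n", K_cluster)
print("distance-based correlations:\n", K_distance.round(2))
# Sweden-Finland ends up much more correlated than Iceland-Sweden,
# matching the intuition that Iceland is the odd one out.
```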

]]>I don’t think there are any good statistical models for how nations form (?), but another way to accommodate spatial autocorrelation could be through Gaussian process regression, with the covariance matrix informed by pairwise geographic distances (through e.g. waypoints). Statistical Rethinking 13.4.1 has a nice walkthrough of how to fit this sort of model in Stan.
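For a self-contained sketch of the idea (in NumPy rather than Stan, with invented locations and hypothetical kernel and noise parameters), here is an exponentiated-quadratic kernel of the kind used in that chapter, applied to predict a held-out country’s varying effect from its neighbors:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented 2-D locations for 4 observed countries plus 1 held-out country
coords = rng.uniform(0, 5, size=(5, 2))
D = np.linalg.norm(coords[:, None] - coords[None, :], axis=-1)

eta2, rho2, sigma2 = 1.0, 0.5, 0.1   # hypothetical kernel/noise variances
K = eta2 * np.exp(-rho2 * D**2)      # exponentiated-quadratic covariance

obs, new = slice(0, 4), 4
y = rng.multivariate_normal(np.zeros(4), K[obs, obs] + sigma2 * np.eye(4))

# GP conditional (posterior) mean and variance for the held-out country
A = np.linalg.solve(K[obs, obs] + sigma2 * np.eye(4), K[obs, new])
mean_new = A @ y
var_new = K[new, new] - K[obs, new] @ A

print(f"predicted varying effect: {mean_new:.2f} +/- {np.sqrt(var_new):.2f}")
```

The prediction shrinks toward zero as the held-out country gets farther from all observed ones, which is exactly the behavior a discrete cluster indicator cannot express.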

A Chinese Restaurant model could also be used to average over possible partitions of the countries, if one is uncertain about clustering and unwilling to assume that relatedness between countries is influenced much by their geographic closeness.
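A minimal sketch of the Chinese Restaurant Process side of this, with a hypothetical concentration parameter: each country joins an existing cluster with probability proportional to that cluster’s size, or opens a new cluster with probability proportional to alpha, without any reference to geography:

```python
import numpy as np

def crp_partition(n, alpha, rng):
    """Sample one partition of n items from a Chinese Restaurant Process."""
    tables = []                                   # tables[k] = item indices
    for i in range(n):
        # Join table k with prob |table k| / (i + alpha);
        # open a new table with prob alpha / (i + alpha).
        probs = np.array([len(t) for t in tables] + [alpha], dtype=float)
        probs /= (i + alpha)
        k = rng.choice(len(probs), p=probs)
        if k == len(tables):
            tables.append([i])
        else:
            tables[k].append(i)
    return tables

rng = np.random.default_rng(2)
partition = crp_partition(35, alpha=1.0, rng=rng)  # 35 countries, as above
print(f"{len(partition)} clusters with sizes:", [len(t) for t in partition])
```

Averaging a model over many such sampled partitions is what lets you stay agnostic about which countries belong together.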

]]>“Within each country, societal issues related to gender roles result in varying degrees of economic distinctions between genders; this leads to differences in expectation of life-trajectories, and that leads to teachers having different expectations for child achievement. The result is that teachers teach children different material or spend different amounts of time explaining different things to boys vs girls. In addition, there may be some inherent differences in the mean or variance of inherent talent among the populations of boys and girls, and a difference in what each child perceives as valuable and worth spending time on or what is interesting. These feedback effects develop through time as children who achieve at something tend to do more of that thing and less of another thing. The net result is that through time, as children age, there is a widening gender gap in achievement among various topics.”

Or whatever, that’s just some stylized idea of what people might think. But suppose it is what you think…Now start encoding that into a mathematical model.

Mathematically, we have several different academic/school-related topics: perhaps math, language, sports, music, etc. We have in each country some attitude about whether each topic is “more male” or “more female”, we have associated effort and encouragement by society for each gender, we have children’s perceived gender role and level of interest, we have etc etc etc. These are the parameters we use to describe the process.

Next we need the process description:

rate of improvement in skills related to topic X for each child is functionally related to the inputs that go into skill development, including encouragement, individual child interest, time spent by the child, availability of instruction in the topic… And country or societal level parameters determine some of the encouragement, and some of the availability…. and society level parameters are similar within groups of countries…. and then across the world different groups of countries have some similarities as well…

In the end you’ll have a large model for worldwide educational variation across multiple topics in which there are thousands of parameters you are uncertain about. This is *the reality* of the problem. Now, because of these thousands of parameters, you’ll want to look for sources of data which can inform the quantities of interest: surveys of children’s interest, datasets on teacher populations: age, gender, subject they teach.. Data on spending in each country, time-series data tracking individual children’s achievement, time series data across different eras… whatever, each source of data is something you can potentially use to constrain the parameters within a given country, and thereby also constrain parameters within neighboring/similar countries, and thereby constrain parameters within continents… etc etc. But data won’t be uniformly available in all locations for all topics. So you’re going to have to work to provide reasonably well thought out priors for your parameters.
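As a deliberately tiny, heavily stylized sketch of this kind of generative model (all parameter values hypothetical), one can simulate the feedback story, where per-country encouragement plus do-more-of-what-you-already-do-well dynamics produce a gap that widens with age:

```python
import numpy as np

rng = np.random.default_rng(3)
n_countries, n_children, n_years = 3, 200, 10

# Hypothetical country-level "encouragement gaps" for one topic:
# extra yearly encouragement boys receive relative to girls.
encouragement_gap = np.array([0.1, 0.2, 0.3])

gap_by_year = np.zeros((n_countries, n_years))
for c in range(n_countries):
    # Both genders start from the same skill distribution.
    skill = {"girls": rng.normal(0, 1, n_children),
             "boys": rng.normal(0, 1, n_children)}
    for t in range(n_years):
        for g in skill:
            enc = encouragement_gap[c] if g == "boys" else 0.0
            # Feedback: growth rises with current skill (children do more
            # of what they already do well), plus encouragement, plus noise.
            skill[g] = skill[g] + 0.05 * skill[g] + enc \
                       + rng.normal(0, 0.1, n_children)
        gap_by_year[c, t] = skill["boys"].mean() - skill["girls"].mean()

print("gap in year 1: ", gap_by_year[:, 0].round(2))
print("gap in year 10:", gap_by_year[:, -1].round(2))
```

Even this toy version makes the point: the observable cross-country gap at any one age is a function of parameters (encouragement, feedback strength) that multiple different data sources could each partially constrain.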

Next you’ll say: gee that’s all well and good, but I don’t have any of that information right now, and I do have this one great dataset with 18 data points in it across 3 countries, how can I make progress so that I can get grants and tenure? That’s a huge hard problem you’ve just described, I’d much rather just grab some dataset and calculate some p values…

And now we know why so little progress is made: we’ve *institutionalized* non-science as if it were the pinnacle of scientific achievement. We’ve reified the idea that, without really thinking about how things work, we can just grab some small datasets, pretend the data come out of random number generators, and check whether we can mathematically detect differences between RNG A and RNG B.

]]>In terms of model division, I saw an op-ed by Michael Porter yesterday that cited a ‘rigorous’ social index. Is such a thing possible? I’d say no, and when he cherry-picks the murder rate, I know it’s a bullshit op-ed. People treat these issues as though they’re epidemiology, as though ‘determining’ penetration rates of virulent diseases with relatively known infection rates in specific populations (depending on exposure modeling, etc.) is the same thing as taking something buried way below the surface, like some measure of genetic diversity, and applying that to a population. You can see a rough connection: some groups are perhaps more prone to certain infections given certain other factors, like the way HIV appears to spread in Africa depending on rates of already existing infections (meaning some of the research shows a form of opportunism). But in general? Treading on Wansink territory.
