Learning about correlations using cross-sectional and over-time comparisons between and within countries

Posted on August 16, 2013 9:49 AM by Andrew

Antonio Rinaldi writes:

Here in Italy a much-hyped topic is the “staffetta tra generazioni” (handover between generations): since the unemployment rate among young people is very high, someone in the government is thinking of encouraging older people to retire early in order to make more jobs available for the young.
I am not an economist and I don’t want to discuss the economic plausibility of such a possible measure.
However, some economists deny its success simply on the basis of a correlation and regression analysis. You can see the charts used as “evidence” here: http://www.lavoce.info/per-una-vera-staffetta-generazionale/ and here: http://noisefromamerika.org/articolo/occupazione-ricambio-intergenerazionale.
The first chart shows that there is no correlation between the activity rate of older people and the unemployment rate of younger people. The second chart shows a positive linear regression (even if the author doesn’t speak of a causal relationship) between the activity rate of older people and the activity rate of younger people, whereas it would be expected to be negative if the “staffetta” (handover) worked.
I find this use of scatterplots by economists to draw their conclusions very misleading.
First, I am puzzled by a regression analysis that gives the same weight to very different states (see, for example, in the second chart, Iceland, 300k inhabitants, at the far right, or Luxembourg, 500k inhabitants, at the far left, compared to Germany, 80M inhabitants). I remember this post http://statmodeling.stat.columbia.edu/2012/07/08/is-linear-regression-unethical-in-that-it-gives-more-weight-to-cases-that-are-far-from-the-average/ on your blog and I wish to ask: is linear regression ethical when points have very different weight/importance, as in the case of nations? Should Germany count as much as Iceland?
But then, and most important, from the charts above I don’t see any evidence at all about the failure of a policy measure such as the “staffetta”. I see only evidence that where unemployment is high among older people it is high among younger people too, and vice versa. In my opinion, the only real evidence about the “staffetta” would come from comparing unemployment rates _before_ and _after_ such a political action in the nations where it has been taken. It seems to me that all the rest is pure illusion.
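
To make the weighting question concrete, here is a minimal Python sketch of a regression slope computed with and without population weights. The function name `weighted_slope` and the numbers in the example are invented for illustration; they are not the data behind either chart, and population weighting is only one possible answer to the question of how much each country should count.

```python
import numpy as np


def weighted_slope(x, y, weights=None):
    """Slope of the least-squares line of y on x.

    With weights=None every point counts equally (ordinary least squares).
    Otherwise each squared residual is multiplied by its weight, e.g. a
    country's population.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.ones_like(x) if weights is None else np.asarray(weights, dtype=float)
    x_bar = np.average(x, weights=w)
    y_bar = np.average(y, weights=w)
    return np.sum(w * (x - x_bar) * (y - y_bar)) / np.sum(w * (x - x_bar) ** 2)


# Invented example: two tiny countries at the extremes, three large ones
# in the middle. Populations are in millions.
older_activity = [80.0, 30.0, 60.0, 50.0, 55.0]
younger_activity = [70.0, 25.0, 45.0, 50.0, 48.0]
population = [0.3, 0.5, 80.0, 60.0, 45.0]

print("equal weights:     ", weighted_slope(older_activity, younger_activity))
print("population weights:", weighted_slope(older_activity, younger_activity, population))
```

In a setup like this the unweighted slope is driven largely by the two small countries at the ends of the x-axis, while the population-weighted slope is driven by the large countries in the middle. Which of the two is appropriate depends on whether the unit of interest is the country or the person.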

My reply: I agree that there’s a limit to what can be understood from cross-sectional comparisons. One way to get a handle on this is to consider the implicit model under which the cross-sectional comparison can give an estimate of a correlation or causal effect over time.
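
As a rough illustration of why the implicit model matters, the following sketch simulates a world in which the handover does work inside every country (a negative within-country slope) while the cross-country scatterplot still slopes upward. Everything here is an assumption made for the example: the function names, the parameter values, and in particular the idea that a single country-level factor raises both activity rates. None of it is estimated from real data.

```python
import numpy as np


def simulate_panel(n_countries=30, n_years=20, within_slope=-0.5, seed=0):
    """Simulate yearly activity rates (in %) for older and younger workers.

    Entirely synthetic: a country-level factor shifts both rates up or down,
    while within a country the younger rate moves by `within_slope` times
    the older rate's deviation from its usual level.
    """
    rng = np.random.default_rng(seed)
    country_level = rng.normal(0.0, 10.0, size=(n_countries, 1))
    older_shock = rng.normal(0.0, 3.0, size=(n_countries, n_years))
    noise = rng.normal(0.0, 1.0, size=(n_countries, n_years))
    older = 50.0 + country_level + older_shock
    younger = 40.0 + country_level + within_slope * older_shock + noise
    return older, younger


def cross_sectional_slope(older, younger):
    """Regress country averages on country averages (one point per country)."""
    return np.polyfit(older.mean(axis=1), younger.mean(axis=1), 1)[0]


def within_country_slope(older, younger):
    """Pooled slope after subtracting each country's own mean over time."""
    x = older - older.mean(axis=1, keepdims=True)
    y = younger - younger.mean(axis=1, keepdims=True)
    return np.sum(x * y) / np.sum(x * x)


older, younger = simulate_panel()
print("cross-sectional slope:", cross_sectional_slope(older, younger))
print("within-country slope: ", within_country_slope(older, younger))
```

Under these assumptions the cross-sectional slope comes out positive and the within-country slope negative, so the scatterplot of country averages says little about what would happen if one country changed its retirement policy. The cross-sectional comparison recovers the over-time effect only under a model in which differences between countries are not driven by such shared country-level factors.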

2 thoughts on “Learning about correlations using cross-sectional and over-time comparisons between and within countries”

Russell Almond on August 16, 2013 at 10:19 am said:

Sandip Sinharay and I did a survey report on this topic a couple of years ago: http://www.ets.org/research/policy_research_reports/publications/report/2012/jgdg

The bottom line came down to an observation that John Willett made a number of years ago: The growth curve of the averages is an unbiased estimate of the average of the growth curves only in the situation where the growth is linear. If you have nonlinear growth, you need an HLM (multilevel model) to get an unbiased estimate of growth, which pretty much requires longitudinal data.
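
One way to read this observation, offered here only as an interpretation, is to compare the average of individual growth curves with the single curve drawn at the average growth parameters. The sketch below does that for a linear model and for an exponential model; the function names and the two growth rates are hypothetical and chosen purely for illustration.

```python
import numpy as np


def linear_curves(slopes, times, intercept=1.0):
    """Average of individual linear curves vs. the curve at the average slope."""
    slopes = np.asarray(slopes, dtype=float)
    times = np.asarray(times, dtype=float)
    curves = intercept + slopes[:, None] * times[None, :]
    return curves.mean(axis=0), intercept + slopes.mean() * times


def exponential_curves(rates, times):
    """Average of individual exponential curves vs. the curve at the average rate."""
    rates = np.asarray(rates, dtype=float)
    times = np.asarray(times, dtype=float)
    curves = np.exp(rates[:, None] * times[None, :])
    return curves.mean(axis=0), np.exp(rates.mean() * times)


times = np.array([0.0, 1.0, 2.0])
rates = [0.1, 0.5]  # two hypothetical individuals

avg_lin, lin_at_avg = linear_curves(rates, times)
avg_exp, exp_at_avg = exponential_curves(rates, times)

print("linear, average of curves:     ", avg_lin)
print("linear, curve at average:      ", lin_at_avg)
print("exponential, average of curves:", avg_exp)
print("exponential, curve at average: ", exp_at_avg)
```

In the linear case the two curves coincide. In the exponential case the average of the curves lies above the curve at the average rate for every time after zero, which is the kind of gap that averaging before fitting cannot recover.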

jonathan on August 16, 2013 at 2:31 pm said:

Isn’t an implicit model being used by people advancing such tangential relations (meant both conversationally and geometrically)? They’re clearly fitting data to their priors as well as proposing it as an output of their model. I make this comment because you say “the implicit model”, which I take as referring to a method and its limitations rather than to the implicit model in use. I imagine that if you developed their model, it might well generate this output, meaning it isn’t just a mistaken or irrational reading, even though that model sucks when compared to a better construction.

I don’t know why I bothered with this comment except the issue sparked thoughts about the relation to choice issues. This is a perfect example of the arbitrariness of choice and the arguments about its appropriateness. It’s funny how the one thing we see around us the most, that events always resolve to choice, is such a deeply conflicted issue.
