## Here’s a puzzle: Why did the U.S. doctor tell me to drink more wine and the French doctor tell me to drink less?

This recent post [link fixed], on the health effects of drinking a glass of wine a day, reminds me of a story:

Several years ago my cardiologist in the U.S. recommended that I drink a glass of red wine a day for health reasons. I’m not a big drinker—probably I average something less than 100 glasses of wine a year—but when I remember, I’ll drink a glass or two with dinner when it’s available. I don’t love the taste of wine, but some of it is OK, and I already preferred the taste of the red to the white, so at least that worked out.

Anyway, a while after receiving this recommendation, I spent a year in France, and I had to see a doctor and get a physical exam there as a condition of my work permit. The doctor asked me a bunch of questions (and spoke slowly enough that I could converse), including how much did I drink? I said I drink a glass of red wine a day on the recommendation of my cardiologist (ummm, I probably said “the doctor for my heart” or something like that). The French doctor replied that I should stop drinking as it’s bad for my foie.

I was taken aback. The U.S. doctor (not originally from this country, but still) said I should drink more; the French doctor said I should drink less. Some of this could be attributed to specialization: the cardiologist focuses on the heart, while the general practitioner thinks in terms of total risk. But, still, it was a surprise: I’d think it would be the French doctor who’d be more positive about the effects of a daily drink of wine.

OK, this is N=2 so we can learn almost nothing from this story: it could just be that this French doctor hates alcohol for other reasons, for example. Nonetheless, I’ll spin a statistical tale that goes like this:

Both doctors are being Bayesian. The U.S. doctor, upon hearing from me that I drank occasionally but rarely, inferred that: (a) I don’t drink a lot, and (b) I could drink a bit more without worry that I’d start to drink heavily. In contrast, all the French doctor heard was that I drink a glass of wine daily. From this she could’ve inferred that I might be a heavy drinker already and just not admitting it, or that if I was given any encouragement, I might drink to excess. In addition to all that, alcohol consumption is higher in France than the U.S., so the French doctor is probably used to telling her patients to drink less.
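A toy version of this tale can be written down with Bayes' rule. The sketch below is only an illustration: the function name `posterior_heavy` and every probability in it are invented for the example, not estimates from any data. It shows how a higher prior rate of heavy drinking, combined with a report that heavy drinkers might plausibly give, can lead to a much higher posterior probability that the patient drinks heavily.

```python
def posterior_heavy(prior_heavy, p_report_given_heavy, p_report_given_light):
    """Bayes' rule: P(heavy drinker | what the patient reported)."""
    numerator = prior_heavy * p_report_given_heavy
    denominator = numerator + (1 - prior_heavy) * p_report_given_light
    return numerator / denominator


# All numbers below are hypothetical, chosen only to illustrate the story.

# U.S. doctor hears "I drink occasionally but rarely".
# Assumed: low prior rate of heavy drinking; heavy drinkers rarely say this.
us = posterior_heavy(prior_heavy=0.05,
                     p_report_given_heavy=0.1,
                     p_report_given_light=0.6)

# French doctor hears "I drink a glass of wine every day".
# Assumed: higher prior rate; an under-reporting heavy drinker might say this.
fr = posterior_heavy(prior_heavy=0.15,
                     p_report_given_heavy=0.5,
                     p_report_given_light=0.3)

print(f"U.S. doctor's posterior:   {us:.3f}")  # 0.009
print(f"French doctor's posterior: {fr:.3f}")  # 0.227
```

Under these made-up inputs, the two doctors can reasonably give opposite advice even if they agree about the health effects of wine, simply because they start from different priors and hear different reports.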

What did I actually do, you might ask? I split the difference. I continued to drink red wine but didn’t make such an effort to drink it every day.

1. Seb says:

Life expectancy is quite a bit higher in France than it is in the US. I will jump to the conclusion that this must be due to the consumption of red wine.

2. I would be quite interested in exploring what effects sulfites in wine have. I am apt to buy organic wine, which is claimed to have ‘naturally’ occurring sulfites or a negligible amount.

3. D Kane says:

Given everything you have learned/written in the last decade about problems with published research, why would you think there was any value in such advice?

• yyw says:

^This. I think there are conflicting studies on association of drinking wines with heart health. As far as I know, they were all observational studies.

• Yes I think they are all observational studies and given it is so hard to learn about medical interventions even when studied with randomized trials (given how the research is actually done and published in journals) – it will be decades before the issue will have high quality credible evidence one way or the other.

David Spiegelhalter fairly recently had a go at some observational studies; his take on it is at https://en.wikipedia.org/wiki/Microlife, and he also cautions that there is a lot of uncertainty around it.

4. Matt Skaggs says:

The link at the beginning does not work.

5. jim says:

Too funny.

My mother has had both hips replaced, and all of her siblings and her father have had hip replacements. So when I was a student and had no med insurance, whenever I met a doctor I mentioned this and asked what I could do now to delay problems. Several doctors said “don’t use your hips too much”, which even I knew then was bad advice, and probably would make the condition worse, not better.

My general experience is that the expertise of most experts is *highly* overrated.

• Martha (Smith) says:

Jim said, “My general experience is that the expertise of most experts is *highly* overrated.”

I think part of the problem is that we often think of people (e.g., physicians) as “experts” when they really aren’t. I have become rather skeptical of things that physicians (especially “generalists”) say and do, especially when their examination and “diagnosis” seem cursory. So, for example, I often view what my primary care physician says with skepticism, but when I had a shoulder problem, and he had me lie on my back on the exam table, stood behind me, and pressed several places around my shoulder, and said “It’s the pectoralis minor tendon” when he got to the place that hurt, he had some credibility.

• Terry says:

I share your general skepticism of physicians. I have also found the same exception you have: MECHANICAL problems like injured tendons, torn muscles, bone spurs, broken/fractured bones, pinched nerves, arthritis, etc. I have had some seemingly miraculous cures after a doctor or physical therapist literally pointed to what was wrong. “Yeah, this nerve right here where it runs over your shoulder joint is pinched. It is because you twist to the left as you sit in your chair at the computer. Sit up straight and it will go away.”

• Martha (Smith) says:

Physical therapists generally have more credibility with me than do physicians (at least for the types of things they treat); they seem to have much better knowledge of the mechanics of using body parts, and aren’t lured by medication or surgery as alternatives to hands-on approaches.

• Kyle C says:

This relates to something I have noticed recently, and I wonder whether it is a topic in the medical literature. Evidence (tentatively!) suggests that there are some domains in medicine where “reversing the cause” of a problem helps, and some where, surprisingly, “reversing the cause” does not necessarily help. Two examples of the latter (based on current evidence) seem to be obesity and cholesterol. Obesity is (loosely?) associated with certain health problems and serum cholesterol is (-??) associated with heart problems. But it has not been established that making an obese person non-obese fixes their health problems, or that reducing someone’s high cholesterol makes them less likely to have heart trouble. “Common sense” might seem to suggest this — but so what? I’m skeptical.

• jim says:

I don’t know about these specific circumstances but in general there’s nothing that says a human biochemical process must be continuous or linear or reversible. I don’t know a particular one right off hand but it’s conceivable that certain conditions throw a genetic switch and change the way the body responds to that condition. So accumulating weight might cause problems that don’t reverse when one loses weight. Certainly, problems caused by drinking don’t necessarily reverse when one stops drinking and same for smoking.

• jim says:

Yes, I agree, mechanical things they seem to have a handle on. Stuff that’s been known since before their med school professors went to med school.

6. Shecky R says:

Any chance that recommendations (based on new research) had simply changed by the time you went to France?… For years docs recommended a baby aspirin per day for those of a given age with certain family history; but now it’s gotten much more complicated, and the aspirin recommendation suddenly only applies under certain conditions… more generally, the various risks from long-term daily aspirin use are seen as greater than the cardiac (and perhaps other) risks they’re meant to protect against.

• Anoneuoid says:

When I was born, doctors believed babies couldn’t feel pain and were doing stuff like heart and brain surgery on them without anesthesia. It happened to my friend.

• Martha (Smith) says:

Aargh!

• Terry says:

“doctors believed babies couldn’t feel pain”

Say what?

I’ve seen a very young baby get stuck with a pin, and that baby sure as %@&* acted like it could feel pain.

Looking up Pain in Babies on Wikipedia, though, confirms what you say.

Holy Moly.

• Martha (Smith) says:

“The cry response is increasingly important, as researchers are now able to differentiate between different kinds of cry: classed as “hungry”, “angry”, and “fearful or in pain”.”

Duh — it’s been common knowledge for ages that many mothers quickly become adept at identifying the causes of different kinds of crying in their babies.

• Anoneuoid says:

> Holy Moly.

Yep. It is a great example of:

1) people believing some incorrect thing
2) using NHST to study it for decades
3) continuing to believe the same incorrect thing until some outside force made them think differently

All our current NHST-based system does is measure and then reinforce the collective prior beliefs.

• Matt says:

Every data point fits Anoneuoid’s theory of the dangers of NHST! Awesome! Makes me wonder why he hasn’t won a Nobel prize in multiple fields… presumably it’s because the selection committee is still stuck thinking about science incorrectly.

• Anoneuoid says:

Do you have a counter example? Where people learned something new that changed their minds via NHST rather than the old idea/fad just fading away?

Once again, all this was already pointed out decades ago but ignored by people like Matt who would prefer we go on wasting vast amounts of resources on unproductive pseudoscience:

Perhaps the easiest way to convince yourself is by scanning the literature of soft psychology over the last 30 years and noticing what happens to theories. Most of them suffer the fate that General MacArthur ascribed to old generals—They never die, they just slowly fade away. In the developed sciences, theories tend either to become widely accepted and built into the larger edifice of well-tested human knowledge or else they suffer destruction in the face of recalcitrant facts and are abandoned, perhaps regretfully as a “nice try.” But in fields like personology and social psychology, this seems not to happen. There is a period of enthusiasm about a new theory, a period of attempted application to several fact domains, a period of disillusionment as the negative data come in, a growing bafflement about inconsistent and unreplicable empirical results, multiple resort to ad hoc excuses, and then finally people just sort of lose interest in the thing and pursue other endeavors.

Paul E. Meehl (1978). Theoretical Risks and Tabular Asterisks: Sir Karl, Sir Ronald, and the Slow Progress of Soft Psychology. Journal of Consulting and Clinical Psychology, Vol. 46, pp. 806–834. citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.200.7648&rep=rep1&type=pdf

• matt says:

First, when did soft psychology become the gold standard of research? Congratulations on criticizing a field where the average researcher has taken 3 introductory statistics courses. My issue is with you extending this criticism to any and all fields that use p-values.

What is NHST? In economics (again, my discipline) people use p-values, this is true. However it’s not like all anybody cares about is whether a result is “significant or not”, nor is anyone seriously testing any straw-man null hypotheses. P-values are reported alongside the estimate and the standard error; all three of which are of interest to the researcher / audience! I don’t know why you keep propagating this myth that nobody in social science cares about effect sizes / sign, they just care about “rejecting the null”. I can honestly say I haven’t heard that phrase, “reject the null”, uttered in an economics seminar in my lifetime.

I already gave you an example with the Mariel Boatlift, which seems like a useful paper, and you proceeded to criticize it on grounds other than using the NHST paradigm (e.g. data quality (which is hilarious by the way.. given the paper is using government data on wages – not sure how you improve the data quality)). Go look up anything written by Josh Angrist; he is a reduced-form guy who writes simple, carefully-done papers that often have policy-relevant results that are actionable.

Your “criticism” that social science needs to use theory that has substantive predictions is well-known by any serious researcher. There is a lot of theory in economics, obviously. What you don’t understand is that “deriving theories from first-principles” isn’t so nice in the social sciences. Your little growth rate example is great as a toy example; but what do you do when you keep deriving theories from first principles that fit the data like crap? The idea of having theories that predict different sign/sizes for effects in different contexts is great in theory, but not so useful in practice in my experience. In any case, I always end up responding to you by talking about reduced-form versus structural work; but that’s not even the real argument here.

Anyways, as usual I assume you won’t respond to the content of this comment, rather you’ll just go off on a slightly related tangent that ends up with you crying “ARRRGGGHHH NHST!!! Everyone is so dumb other than me! (Yay for that!)” in a more lettered way.

• I wonder if the ‘growth mindset’ theory [if a theory] will fall into disillusionment.

It is in the interest of specialties to generate content. In fact, that has been mandatory for nearly every institution and domain. In that process, some amount of content is going to fall by the wayside or disappear. And some of it is useless.

I like the analogy that is drawn between the Library of Alexandria and the current data environment, whereby the latter disappears after 5 years or so. The Library of Alexandria was destroyed purposefully. What is behind the disappearance of data of the last 20 years, say, is not clear to me. Is the disappearance by design, inadvertence, etc.?

• Anoneuoid says:

> I already gave you an example with the Mariel Boatlift, which seems like a useful paper, and you proceeded to criticize it on grounds other than using the NHST paradigm (e.g. data quality (which is hilarious by the way.. given the paper is using government data on wages – not sure how you improve the data quality))

That data was garbage (as admitted by the original author), the conclusions didn’t change anyone’s mind, and there was no NHST in that paper! Where do they check for exactly zero difference in employment rate or wages?

So if that is the best you can do as an example of successful NHST… I don’t know what to say.

> Your little growth rate example is great as a toy example; but what do you do when you keep deriving theories from first principles that fit the data like crap?

That example was to show the difference in what the parameters mean between a rationally derived theory and an arbitrary statistical model. It had nothing to do with this. You clearly missed the point entirely.

Also, as I said in the other thread: the first thing is to make sure your data isn’t unreliable crap to begin with. Economists apparently need a Tycho Brahe to put forth the effort to collect reliable data they can hang their hats on. I doubt they will make much progress until then. Messing around with garbage data (even if that is “the best available”) and analyzing it with NHST is at best a waste of everyone’s time. At worst it will lead to advising central banks and governments to do the wrong thing until we end up with nonsense like negative interest rate mortgages.

There are some efforts along these lines in the biomed realm. Eg, Project Tycho (https://www.tycho.pitt.edu/), and SEER (https://seer.cancer.gov/). They are hardly perfect but are a start.

Anyway, so then you have no counter example?

• >not sure how you improve the data quality

Well, maybe you can’t but it doesn’t mean the data quality is sufficient to answer the question. If a major component of the region’s income is black market illicit drugs (and it was) then pretending that your big government data is meaningful will just lead you astray.

> My issue is with you extending this criticism to any and all fields that use p-values.

Anoneuoid has repeatedly said that using a p value to test a specific theoretically motivated hypothesis can be a reasonable choice.

He criticizes bio-medicine, a field where people ignored the very obvious fact that babies feel pain because it was convenient for them to believe that it wasn’t true, and a field where NHST *is* rampant, and a field where Anoneuoid used to do research, and you attack him for your imagined versions of his opinions on the field of economics? Can you find anywhere on this blog where Anoneuoid has made specific opinions on economics other than in response to someone asking him about econ papers etc?

• matt says:

Daniel, please. Anoneuoid makes wide-sweeping and vague criticisms of science in general. Just above he challenged me to find one example of NHST where it provided a useful result. He didn’t specify a field, and rarely does.

I just think it’s very easy to say “oh, why don’t you just build a model that has substantial theoretical predictions and then verify whether they hold in the data?”. This is really hard to do in a single research project – I’m not saying research should be easy, but at the same time, research needs to be conducted within a reasonable timeframe given the nature of career progression in academia.

I do think if you look at some social science literatures on the whole, this process does kind of happen. A paper will propose some theory that is likely overfit to the data specific to their project; but, then other papers will test this theory on new data and revise it; and this process repeats itself. And yes, a lot of these papers would still be classified as “NHST”, I suppose, but in the aggregate they are doing useful things. But if you think there are theories out there that will predict macroeconomic trends the way that growth formulas predict cell growth, you are simply delusional and haven’t felt the pain of working with economic data and theories.

• Anoneuoid says:

> Just above he challenged me to find one example of NHST where it provided a useful result. He didn’t specify a field, and rarely does.

Right, because it doesn’t matter. A logical fallacy is a logical fallacy no matter who is doing it.

You can find me criticizing LIGO for their NHST usage (basically they put far too much emphasis on rejecting the background model) on this blog too. I don’t know if it is going to show up, but Andrew said he was going to post something in Sept about galactic rotation curves I sent him a while back too.

And, my other post got held up due to the links so perhaps you didn’t read it. But your one example of a supposed successful use of NHST was:

1) not successful
2) not NHST

• I’m not sure anything Angrist does is what I’d classify as NHST. You seem to miss the point by misunderstanding the difference between say NHST and classical estimation theory.

Actual NHST: compare results of measurement to some arbitrary “null” random number generator, if you can reject the idea that this random number generator produced the data, immediately assume the truth of your favorite alternative theory. For example Genome Wide Association Studies for the involvement of certain genetic abnormalities in certain diseases. Or studies of acupuncture vs drug therapy for allergies, or non-inferiority “testing” between drugs, or various comparisons between educational policies, or nutrition studies (Wansink anyone?)… It’s literally rampant, though typically not what you find in Econ.

That actual NHST methodology is a straight up fallacy, and yet it occurs all the time in some fields. You can’t do useful science in *any* field using that paradigm. It’s like asking “how could it be the case that assuming 1=0 is NEVER going to lead to valid science?” It just isn’t, doesn’t matter the field or what follow-on assumptions you use. You have to start with logically valid ideas before you can do science.

Classical Estimation: Make some assumptions about your data generation and collection process that allow you to prove that, under the assumptions, some particular statistic of the data F(Data) is close to some underlying parameter that controls the data generation process to within a controlled error magnitude. Refuse to use any prior information about the underlying parameter, refuse to use any structural connections between one set of observations and another set of observations that might constrain your parameter estimates (hierarchical models); basically, refuse to do any “regularization” that might introduce “bias”.

Classical Estimation isn’t NHST, but it’s almost everything being done by your example Economist Angrist as far as I can tell (I looked at a few of his papers in Education analysis related to Charter schools for example).

It’s not invalid, it’s just weak sauce and not really a full serving of science. Specifically, it’s a kind of measurement, it’s like building a better Oscilloscope by arranging for some special circuit to cancel out a particular kind of noise or something.

Measurement is useful, important, critically important even, but it’s just one part of science. Econ has failed to have a lot of success in structural modeling or theoretical anything. You’ve said so yourself. I have particular ideas about why that would be and a lot of it has to do with the lack of acknowledgement of dynamical processes IMHO. But one answer by a subset of economists has been to give up on theory, and simply go out and try to measure a lot of examples using tricky “identification strategies” and “unbiased estimators” that are un-regularized and assume very weak data generation process models.

It’s not my impression that Anoneuoid has much to say about Econ, except in so far as he apparently makes a living putting his money where his mouth is and trading in the finance markets on the basis of his own ideas of how data should be analyzed. That he isn’t bankrupt indicates something.

Is it OK to give up on a “science” of economics and just be satisfied with a series of “in the aftermath of Katrina, the conversion of New Orleans schools to charters resulted in some overall average improvements in educational outcomes for the poorest regions” (Angrist?) or “giving children toothbrushes in elementary schools in Orange County CA in the period 2000-2010 reduced cavities by 10% vs what they would have been” (a made up example) or whatever?

Sure, sometimes it’s hard to even just measure what happened and to the extent we can, that can be good. But I don’t think wandering around on the street with a caliper looking for things to measure is what science is about. That Economists seem to think this is the height of their modern achievement is basically an indictment of the whole field, not a cause for celebration. Your example with the Mariel boatlift had several follow-up papers that came to entirely different conclusions. So apparently we don’t even have a satisfying non-controversial answer to the pure *measurement* question of how much did wages change post boatlift. Now what?

• Also, I agree it’s hard to come up with theories. But what should we do in the Mariel boatlift example to improve things?

A major question is how good is the data on wages considering the effect of a major black market. So, we should for example seek to estimate an unknown parameter that describes the distribution of wages and employment from the black market. Let’s take a look at the purchase of boats, the money put into law enforcement, the number of arrests, suspicious banking activity, cash and drug seizures, migration patterns within Florida, consumption of boat fuel… Put these together into a structural model that describes how these might behave in the presence of a dynamic ramping-up of an illicit market. Let’s also interview ex-cons who participated in the 80’s cocaine trade and ask them how it worked, what kinds of labor they employed, how much was required for different tasks, what the loading and unloading times were, how much product was lost to seizure, to sinking, etc… Build ourselves a model of how much manpower must have been required to carry out the level of activity we estimate must have been going on. We can hierarchically pool it with legitimate data on the construction trade, and legitimate shipping trade, and say sport-fishing using similar size boats. We can use physical constraints: you can’t move a full displacement boat across the water faster than its hull speed, so we can estimate the maximum transportation rate of cocaine for a given level of boating activity… etc etc.

Now we’re doing science: making connections between known facts about the world which allow us to predict physical quantities like kg of cocaine imported per month and number of person hours worked doing menial tasks associated… we can start to include economic factors by estimating wages from a combination of interviews of ex-cons and comparison of theoretical ideas of return on risk: people aren’t going to risk their lives importing cocaine to reliably make less than they would as a ditch digger for example…
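The hull-speed constraint mentioned above can be turned into a simple upper bound. The sketch below uses the standard rule of thumb for displacement hulls (hull speed in knots is roughly 1.34 times the square root of the waterline length in feet). Everything else is an assumption made for illustration: the function names, the trip distance, the turnaround time, the payload, and the fleet size are hypothetical placeholders, not estimates from any source.

```python
import math


def hull_speed_knots(waterline_length_ft):
    """Rule-of-thumb hull speed of a displacement hull: 1.34 * sqrt(LWL in feet)."""
    return 1.34 * math.sqrt(waterline_length_ft)


def max_round_trips_per_month(waterline_length_ft, one_way_nm,
                              turnaround_hours, hours_per_month=720.0):
    """Upper bound on round trips, assuming the boat always runs at hull speed."""
    speed = hull_speed_knots(waterline_length_ft)
    hours_per_trip = 2 * one_way_nm / speed + turnaround_hours
    return math.floor(hours_per_month / hours_per_trip)


def max_cargo_kg_per_month(n_boats, payload_kg, **trip_kwargs):
    """Physical ceiling on cargo moved per month by a fleet of identical boats."""
    return n_boats * payload_kg * max_round_trips_per_month(**trip_kwargs)


# Hypothetical inputs: 36 ft waterline, 200 nautical miles each way,
# 24 hours to load and unload, 10 boats carrying 500 kg each.
trips = max_round_trips_per_month(waterline_length_ft=36, one_way_nm=200,
                                  turnaround_hours=24)
ceiling = max_cargo_kg_per_month(n_boats=10, payload_kg=500,
                                 waterline_length_ft=36, one_way_nm=200,
                                 turnaround_hours=24)
print(trips, ceiling)
```

A bound like this does not estimate what actually happened; it only rules out quantities that are physically impossible, which is the kind of constraint a structural model can use to discipline its other parameters.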

• matt says:

Thanks for the reply, Daniel. I agree with a lot of that. Although I don’t think Angrist-style work is just “measurement”, but I don’t think there is much point re-hashing that debate.

Only point regarding Anoneuoid claiming to make his living in financial markets… does not surprise me. His writing reminds me of how I imagined Taleb to be when reading his books – everybody is an idiot except him. Also it’s not very impressive to make a living trading stocks.. if a monkey is doing the trading he’ll get market return, which, given enough starting capital will be just fine for most lifestyles.

• Anoneuoid says:

> Only point regarding Anoneuoid claiming to make his living in financial markets… does not surprise me. His writing reminds me of how I imagined Taleb to be when reading his books – everybody is an idiot except him. Also it’s not very impressive to make a living trading stocks.. if a monkey is doing the trading he’ll get market return, which, given enough starting capital will be just fine for most lifestyles.

It is surprisingly easy. I don’t understand what other people are doing that I can make money so easily doing this. My returns are far above market returns (eg, up ~90% ytd while the S&P is only up 20%, which is typical). But that has nothing to do with NHST (afaik).

And it isn’t that people are idiots. The people doing NHST are brainwashed, lazy, or scamming.

Can you share your example of successful use of NHST now?

• > Although I don’t think Angrist-style work is just “measurement”, but I don’t think there is much point re-hashing that debate

That’s fine, but I think you must be able to see that the kind of suggestions I made about how to attack the problem of a robust black market affecting our estimates of low skilled wages doesn’t look much like what is done in typical Econ research, right? I mean, how many Econ researchers are writing equations for the rate at which people earn money from the cocaine trade across the marine border of Florida, and basing them on structural models of the number of boats available, the mean fraction of the time that boats are in use, the number of crew members on a typical boat, an estimate of wages paid based on interviews with ex-cons, data on arrests, data on DEA investment, etc etc etc?

I’m not saying it doesn’t happen, but I don’t recall anyone doing that kind of analysis.

What I do recall is stuff like this: https://statmodeling.stat.columbia.edu/2018/08/23/problems-published-article-foot-security-lower-mekong-basin/ where Economists concluded something badly disconnected from reality. If I remember correctly, there were very basic problems with the physics of flooding, the kind that would get you an F on an undergraduate midterm in Civil Engineering Hydraulics.

NHST is fine. You guys just don’t understand it.

• Anoneuoid says:

> NHST is fine. You guys just don’t understand it.

I’ve still yet to see a single real world example of NHST being used without a blatant logical fallacy being committed. Can you share one?

• Hypothesis testing is fine, it answers a well defined probability question in frequentist probability, unfortunately its applicability is far more limited than is generally believed, but it is a valid mathematical calculation.

Null Hypothesis Significance Testing is the application of a hypothesis test to answer a question usually no one believes is true to begin with (could my data have come from random number generator X(Y) whose parameter value Y is usually 0).

Following a negative answer to this question people routinely commit a logical error of assuming their preferred hypothesis is true (it didn’t come from X(Y) therefore it came from Z(Q) where sometimes but not always Z=X and Q = some estimated value for Y). Nothing is ok about that.

Following a successful rejection of a null hypothesis you can conclude one of any number of things:

1) The data do not approximate a random sequence

2) The data approximate a random sequence but from some other distribution

3) The data approximate a random sequence but from the same distribution with a different value of the parameter

4) The underlying data approximate a random sequence with the null value of the parameter but measurement bias, filtering, additive or multiplicative or correlated errors alters the measured data distribution….

etc etc
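A small simulation can illustrate why a rejection does not single out one of the explanations listed above. The sketch below is a toy example under stated assumptions: the helper `z_test_p_value`, the sample size, the effect size of 0.2, and the two scenarios are all made up for illustration. One data set has a truly shifted parameter (item 3); the other has the null value of the parameter plus a constant measurement bias (item 4). A normal-approximation test of "mean = 0" rejects in both cases and cannot tell them apart.

```python
import math
import random
import statistics


def z_test_p_value(sample, null_mean=0.0):
    """Two-sided p-value for H0: mean == null_mean (normal approximation)."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    z = (statistics.mean(sample) - null_mean) / standard_error
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for a standard normal.
    return math.erfc(abs(z) / math.sqrt(2))


rng = random.Random(0)  # fixed seed so the run is reproducible
n = 2000

# Scenario A (hypothetical): the underlying parameter really is 0.2, not 0.
shifted_parameter = [rng.gauss(0.2, 1.0) for _ in range(n)]

# Scenario B (hypothetical): the underlying parameter is the null value 0,
# but the measuring instrument adds a constant bias of 0.2 to every reading.
biased_measurement = [rng.gauss(0.0, 1.0) + 0.2 for _ in range(n)]

p_a = z_test_p_value(shifted_parameter)
p_b = z_test_p_value(biased_measurement)
print(f"p-value, shifted parameter:  {p_a:.2e}")
print(f"p-value, biased measurement: {p_b:.2e}")
```

Both p-values come out far below any conventional threshold, yet only scenario A corresponds to a real effect. Deciding between them requires outside knowledge about the measurement process, which is exactly what the test itself does not supply.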

• matt says:

Daniel, how will you validate this structural model? That is, why should I believe that your structural model represents reality in any way shape or form? It can predict the effects of the Mariel boatlift well? So what. An ML algorithm could do the same. Presumably we will need to test the predictions of your model in another setting – one where there is a clear intervention. Where will you find that test? That is where Angrist comes in. He has nice causal estimates that can be compared to the theoretical predictions of the overflowing stack of models in economics. Is this nonsense, or no? That is my main issue when you describe your modelling process: why should I take your estimates as serious structural parameters?

• AllanC says:

Matt,

I am not sure if you and Daniel are talking past one another or if there is an underlying logic or process in your training as an economist that has you unable to see what Daniel is saying. But to me he is talking about a basic model building process where the utility of the end result is pretty clear.

In mathematical modeling, the process for model development is quite simple (though rather difficult to apply well): formulate the problem, guess what a useful model might look like by incorporating known information about the system under study (such as physics, max capacities, etc.), compare that against available data taking into account known issues (such as measurement, transcription, etc.), if the model works reasonably well as defined by user requirements for predictive accuracy (and/or some particular loss function) then it is useful, if it doesn’t then go back to stage 2 and reformulate. The reason to have confidence in this process is because the model intentionally incorporates knowledge / constraints of system under study from its inception.

Will you ever formulate the correct model? Gosh no. There is no such thing as a correct model or correct parameter specification or correct parameter values. The only way to do that would be to create a 1-1 replica of reality with time forward capabilities.

If the model we conjured up in the process described above is only one of a set of other models with similar scoring according to our loss function, then we have to ask ourselves: is this process reliable? Well, I guess that is a question of faith. But I would sooner trust the inferences from a model that was developed explicitly incorporating knowledge of the system under study than a statistical estimating procedure, given the same scoring according to our loss function.

• matt says:

Everything you wrote there Allan would seem to describe a model built for predictive accuracy. I think in physics predictive accuracy is nearly 100%, so prediction and causality appear to go hand in hand. That’s not the case in economics. It’s not about prediction, unless it’s predicting the effects of interventions.

• So one set of authors wrote the first Mariel boatlift paper, and a second set wrote the second one. Anoneuoid mentioned an important black market reason why neither estimate might be reliable. Both papers had opposite conclusions… evidently reduced form estimates don’t magically give you a reliable answer either right?

• matt says:

Ya the Mariel Boatlift and immigration more generally is a fraught literature. For what it’s worth, Borjas’ rebuttal is not good. He made some very weird sample selection decisions. Borjas has written about 300 immigration papers, with every single one finding a negative impact on native (i.e. American) workers. It is pretty clear that he massages the data to get the negative results at this point. I honestly don’t think you can learn much from correlating immigration flows and city-level wages; it’s simply too high-level to really understand what is driving the result. Plus cities aren’t isolated economies. So, poor decision on my part – I fell prey to availability bias (if that result replicated..).

In summary: what should I do with your model with all those moving parts that describes the Mariel Boatlift? Presumably it can give me some counterfactual predictions for what would have happened to a bunch of variables in Miami (e.g. low-skill wages) if fewer or more migrants had arrived on the boat. But again, how is this useful? I want to recover some parameters that can be generalized to other settings. All you’ve done is fit some complicated model to the data, likely making many functional form assumptions along the way. To think that this model could be used to predict the effects of immigration in any other context apart from the 1980 Mariel Boatlift is wishful thinking.

• matt, to the extent that a particular event has a lot of particular factors that only apply to that one event, a model of the one event can be rather non-generalizable. But there can be *components* of the model that are generalizable.

This is true in physics as well. Suppose you have a piece of plate glass, and it’s hit by a rock… from the resulting pattern of cracks you can perhaps infer some information about the speed and shape of the rock, and the manufacturing defects in the glass, and so forth… but in the end, you can’t use those specifics to predict where the cracks will be when a different shaped rock hits on a different trajectory to a different piece of glass… because those factors will all change and you can’t predict what they will be.

Nevertheless there can be portions of the model which apply to other pieces of glass, for example maybe the glass-air surface energy is consistent across all pieces of glass of the same chemistry… so we can use this information derived from the measurements of the first glass sample to predict say whether a given low-velocity rock has sufficient kinetic energy to cause a crack to form or if the rock will bounce off.

So, what can you do with the detailed Mariel boatlift model? It depends on which processes you included in the model and why they are important. Suppose for example you include factors involving language knowledge in low skilled areas that have to do with coordinating work in large crews that speak heterogeneous languages. You might well use this “difficulty of employing heterogeneous crews” parameter to estimate the effect of say bringing a large group of Arabic-speaking refugees into a region of Italy or something… you expect that at low densities they have a difficult time being employed, but at high densities they can work together in large groups, so you predict that there will be concentrations of refugees in small numbers of towns where labor markets are able to take advantage of skills that people from the home-country tend to have… or whatever. We should be looking for commonality across events and try to explain multiple observations at once using a model that takes into account real-world stuff like this. Maybe there’s a parameter involving availability of communication technology that predicts the transaction cost in terms of time spent looking for low skilled labor opportunities and it has something to do with population density… if there are people in your area from your ethnic group then they tend to spread word of employment opportunities via text message, but if the density is low in your area, then the messages come from too far away for you to be able to get there in time… etc

If you give up on even trying to understand the mechanisms by which immigrants interact together and with the various kinds of labor and housing markets to form communities… then you’re never going to discover these commonalities which may be occurring all around the world wherever there is a large flux of immigration.

• Anoneuoid says:

In case anyone wonders about the references to the “boatlift thread”: https://statmodeling.stat.columbia.edu/2019/09/02/he-says-it-again-but-more-vividly/

• Terry says:

Anoneuoid said:

“At worst it will lead to advising central banks and governments to do the wrong thing until we end up with nonsense like negative interest rate mortgages.”

What are you referring to? Do you mean the negative interest rates on German bonds? Those aren’t mortgages. Also, modestly negative interest rates on government bonds are not nonsense. Given that it is costly to transfer money forward in time via other means, a modest negative interest rate can be reasonable. Interest rates are a price, and like other prices are subject to supply and demand.

• Terry says:

Anoneuoid said:

“My returns are far above market returns (eg, up ~90% ytd while the S&P is only up 20%”

What sort of strategy produced these returns? I have never seen a strategy produce returns this large over a significant length of time.

• Terry says:

Daniel Lakeland said:

“So, we should for example seek to estimate an unknown parameter that describes the distribution of wages and employment from the black market. Let’s take a look at the purchase of boats, the money put into law enforcement, the number of arrests, suspicious banking activity, cash and drug seizures, migration patterns within Florida, consumption of boat fuel…” etc.

I find it hard to believe a model of this complexity would give reliable results. Why would any of these pieces be better estimated than the wage data in the existing papers? Can you give an example where a model of this complexity has been successfully executed? Perhaps agricultural models? You have talked about how unsuccessful structural macroeconomic models from the fifties were, so why would you think a model here would be more successful?

I’m not sure you’re wrong, but I’m very suspicious. I personally like such models, but I don’t have a good reason to think they are practical.

• Terry, I’m not aware of any such models in Econ. But I’m not an economist. I built a model like this to investigate the question of how best to invest in civil infrastructure resources in the presence of natural disasters, but I didn’t fit it to data. In the end this kind of work takes a lot of time and effort, and I’m not personally paid to do it. In the sense of not practical if you want to get tenure, or not practical if you want to convince a grant agency to fund it, you might be right. But I think models like the one I worked on are THE ONLY way to address questions that are inherently complicated and have lots of real world moving parts. The whole reduced form stuff is IMHO mostly wanking as it can never address the question of what would happen in alternative scenarios and the estimates it gives are always retrospective “in the past at this particular time and place this thing happened… maybe”… what we want from science is causal predictions “if we do x now then y will happen around 3 years out”

• Anoneuoid says:

> What are you referring to? Do you mean the negative interest rates on German bonds? Those aren’t mortgages. Also, modestly negative interest rates on government bonds are not nonsense. Given that it is costly to transfer money forward in time via other means, a modest negative interest rate can be reasonable. Interest rates are a price, and like other prices are subject to supply and demand.

It is just beginning now. Denmark seems to be the testing ground for this:
https://www.bloomberg.com/news/articles/2019-05-23/bankers-stunned-as-negative-rates-sweep-across-danish-mortgages
https://www.independent.co.uk/news/world/europe/denmark-bank-negative-interest-rates-millionaires-charge-a9074756.html

This will spread throughout the EU, then eventually to the US. That is why there is a “war on cash”.

> “My returns are far above market returns (eg, up ~90% ytd while the S&P is only up 20%”

> What sort of strategy produced these returns? I have never seen a strategy produce returns this large over a significant length of time.

I go long companies whose products I recently began to use who also have annoying but much larger competition (eg, AMD vs intel/nvidia), and short way overvalued companies whose products I used to use but stopped because they started becoming annoying by abusing their customers, pushing political agendas, etc (eg Netflix). Basically invest in what you know.

Generally if I am going to go short I wait for the top of the dead cat bounce rather than try to call the peak. Then if ever a macro event drops the whole market I use my profits from the shorts to buy more longs for cheap. Rinse and repeat.

Example: Right now I am waiting for TEAM (Atlassian) to dead cat bounce. They announced they fired all their “brilliant jerks”, and as a bitbucket user for years I have noticed they recently began needlessly redesigning the look of the page, adding feature bloat, etc. All the symptoms of a company culture going toxic. I expect to wait for a couple ERs before this really starts to show up in the financials, ideally (for that trade) a new bitbucket/github competitor will come out as well.

7. an0n says:

Is there actually good causal evidence for or against drinking whine when it comes to health? I think this would be rather difficult to credibly estimate with observational data. Variation in whine drinking is surely associated with a great many other choices that also potentially affect health.

• Kyle C says:

Even worse than that: researchers are trying to correlate outcomes with what people *say in surveys* they consume. Which adds a whole other level of confounding.

My working theory, which IMO continues to hold up, is that in any given population the diet that appears to “prolong life” will turn out to be the diet that the richer people eat. In the US, who says on a survey that they drink one glass of wine with dinner, no more, no less, no beer, no liquor? An upper middle class or rich person. They live longer. But it’s not the wine. Or the kale.
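Kyle C's theory is easy to illustrate with a toy simulation. In the sketch below, wine has no effect on lifespan at all, but richer people both live longer and are more likely to report a daily glass. Every number (share of rich people, reporting rates, lifespans) is invented for illustration, not estimated from any survey.

```python
import random
from statistics import mean

def simulate_population(n, seed):
    """Toy population in which wealth drives both wine reporting and lifespan.
    All parameters are made up; wine itself has zero effect on lifespan."""
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        rich = rng.random() < 0.3
        wine = rng.random() < (0.6 if rich else 0.1)
        lifespan = 75 + (5 if rich else 0) + rng.gauss(0, 5)
        people.append((rich, wine, lifespan))
    return people

def wine_gap(people):
    """Mean lifespan of reported wine drinkers minus mean lifespan of everyone else."""
    drinkers = [life for _, wine, life in people if wine]
    others = [life for _, wine, life in people if not wine]
    return mean(drinkers) - mean(others)

people = simulate_population(20000, seed=1)

naive_gap = wine_gap(people)                                   # ignores wealth
gap_among_rich = wine_gap([p for p in people if p[0]])         # rich only
gap_among_others = wine_gap([p for p in people if not p[0]])   # non-rich only

print(f"naive gap: {naive_gap:.2f} years")
print(f"gap among rich: {gap_among_rich:.2f} years")
print(f"gap among non-rich: {gap_among_others:.2f} years")
```

The naive comparison shows wine drinkers living noticeably longer, while the gap within each wealth group is close to zero. Real data would of course not come with a clean "rich" label, which is part of the problem.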

• Martha (Smith) says:

“drinking whine”?

8. Steve says:

I remember when my son needed surgery. The surgeon went over a number of statistics with me about risks and possible outcomes and recommended the surgery. Then, he told me that the opposite recommendation would be given in Europe, where they would wait for signs of damage first before the surgery. Essentially, the issue was to either do the surgery to prevent damage with a risk that the surgery could cause the damage or wait until you see signs of damage (which may never occur), and do the surgery to prevent more damage. The surgeon, who was very smart, was frank about the fact that the difference between the U.S. and European recommendations came down to nothing more than cultural bias. The U.S. had a prior that surgery was less risky, but they were all looking at the same data. I think this happens all the time in medicine.

• jim says:

“The U.S. had a prior that surgery was less risky”

I wonder if that prior is about patient risk or doctor revenue.

The US medical community is highly intervention oriented. They almost always recommend action over no action.

• Overdiagnosis is a really precarious issue. Add to that the fact that doctors may overvalue test results, including by failing to distinguish between absolute and relative risks. More fundamentally, the way test results are presented is somewhat rudimentary, imo.

• Erling says:

Working as a healthcare professional in a Scandinavian country (with a universal single payer system), that’s my impression as well. In a few instances I’ve had patients with a treatment history from the US and very good health insurance coverage, and been puzzled by what to me seems like overtreatment and excessive assessment. But more is clearly not always better in the case of healthcare.

9. I speculate that cutting the amount of food eaten is one of the best pieces of advice, along with choosing the least processed and organic food. Sydney Wolfe told me that in a telephone conversation.
The second is cutting out sugar as far as feasible. I admit that I can’t resist a few cookies daily, a dietary indulgence I hope to abandon. High-glycemic-index foods also contribute to metabolic conditions. My metabolic profile is very good. I may just have good genes too.

What I don’t subscribe to is the ‘everything in moderation’ mantra. I am chagrined to say that most of us have over-indulged.

As for exercise, that is a complex topic. I exercise way more now than I did 10 years ago b/c I want to maintain my aerobic & anaerobic capacities. The latter is key to better heart health, provided you don’t have other physical constraints. It is much harder as you get older. I prefer anaerobic and strength training. Now I’m concentrating on agility and flexibility by dancing and stretch banding.

• Vegard Lysne says:

Could you provide any good arguments to choose organic on a general basis?

• Joshua Pritikin says:

In terms of eating the finished product, I don’t think there is a big difference. But farm workers suffer in the application of pesticides. Better to choose organic for the sake of farm workers.

• Why would there not be much difference in the finished product?

• Vegard Lysne says:

Well, organic farming also uses pesticides, but not the same as in conventional farming. In many situations the organic pesticide is far more toxic. So I wouldn’t be too sure that conventional farming is necessarily worse for the farm workers.

The nutritional differences in the finished product are at best minimal, and not at all consistently in favor of organic; the effects go in both directions for different nutrients.

• Joshua Pritikin says:

> In many situations the organic pesticide is far more toxic [than the conventional].

This is news to me. Please provide an example.

10. Andrew Halim says:

Nutrition research is amazingly noisy.
https://kresserinstitute.com/the-fundamental-problem-with-most-nutrition-research/

Do not drink wine for the health benefit. Drink it if you like it, in moderation.

Let’s generalize it:
Do not consume [food product] for the health benefit. Consume it if you like it, in moderation.

11. Not Trampis says:

Personally I have always taken the apostle Paul’s advice to his young friend Timothy: drink a little wine for your stomach’s sake. Good advice and godly. A win/win situation, although whether it is Bayesian is problematic.

12. oncodoc says:

Both doctors were acting on their priors. Modest wine drinking is good for your heart. Americans drink less than 3 gallons per year; the French drink more than 21 gallons. Both doctors assumed that you were an average consumer. Thus the differing recommendations.
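To put oncodoc's figures on the same scale as "a glass a day", here is a rough conversion. The 3 and 21 gallons per year are the numbers from the comment (not verified here); the choice of US gallons and a 5 fl oz glass are assumptions made only for this illustration.

```python
US_FL_OZ_PER_GALLON = 128  # US liquid gallon
GLASS_FL_OZ = 5            # assumed size of one glass of wine
DAYS_PER_YEAR = 365

def glasses_per_day(gallons_per_year, glass_fl_oz=GLASS_FL_OZ):
    """Convert yearly wine consumption in US gallons to average glasses per day."""
    return gallons_per_year * US_FL_OZ_PER_GALLON / glass_fl_oz / DAYS_PER_YEAR

us_average = glasses_per_day(3)       # figure quoted in the comment for Americans
french_average = glasses_per_day(21)  # figure quoted in the comment for the French

print(f"US average: {us_average:.2f} glasses/day")
print(f"French average: {french_average:.2f} glasses/day")
```

Under these assumptions the American average works out to roughly a fifth of a glass per day and the French average to roughly one and a half, so a patient reporting one glass a day sits well above the first prior and somewhat below the second.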

Back in the late 70s, I remember my dad telling my uncle that the doctor said that he (my dad) should cut back on his drinking, that drinking a 12oz can of Bud per day doubled his risk of colon cancer. My uncle, a steelworker, told my dad, a CPA, that they should be ok since they only drank 16oz cans of Bud. They’re both in their mid-80s now doing fine although they’ve switched from beer to scotch.

14. Tom says:

I remember seeing a cartoon (I think it was in New Scientist) where someone holding two glasses of wine was explaining – ‘you see, this glass of wine extends my life by 15 minutes and this one simply brings it back to where it was before’

15. Joshua Pritikin says:

TLDR: Grapes are better than dealcoholized red wine is better than regular red wine

“If eating berries with a meal decreases inflammation, what about drinking berries? Sipping wine with your white bread significantly blunts the blood sugar spike from the bread, but the alcohol increases the fat in the blood by about the same amount. If you eat some cheese and crackers, this is the triglycerides bump you get. If you sip some wine with the same snack, they shoot through the roof. Now, we know it was the alcohol, because if you use dealcoholized red wine (nonalcoholic wine—the same wine, but with the alcohol removed), you don’t get the same reaction. This has been shown in about a half-dozen other studies, along with an increase in inflammatory markers. So, it may help in some ways, but not others.”

Both the media and the experts have pretty much done the field in. With the exception of a few examples I mention in the post below and some others, studies like those read by the cardiologist are simply too noisy and biased to be useful. They have zero business even being in the same sentence as the word “policy.” We need better measurements and better analyses

https://lesslikely.com/nutrition/nutritional-epidemiology/

• jim says:

+1. Exactly.

• Terry says:

+1

There is a basic human drive to assure the purity of one’s food. Back when dinosaurs used to spit on our food (joke) there were good reasons for this and the purity instinct had a big positive payoff. Nowadays, the purity instinct has much less to work with. (Smoking, alcoholism, extreme obesity and a few others.) So, the purity instinct expresses itself more and more on things like fish oil, gluten, and organic kohlrabi. Maybe some of these things make a little bit of difference, but not much. Humans gonna human.