A well-publicized example of problems with observational studies is hormone replacement therapy and heart attack risks for postmenopausal women. In brief, the observational study gave misleading answers because the “treatment” and “control” groups differed systematically. Could the method of propensity scores have found (and solved) the problem?
Hormone replacement therapy and heart attacks
The evidence from the Women’s Health Initiative, a randomized clinical trial from the 1990s, is that hormone replacement therapy increases the risk of heart attack in older women. (Here’s a summary from the American College of Obstetricians and Gynecologists, which I found at the entry for Hormone Replacement Therapy at the National Library of Medicine.)
Confusion from the observational study
The above findings surprised people because observational evidence from the Harvard Nurses’ Health Study had found that women who used hormone replacement therapy had a lower risk of heart attacks. For example, from an article in the Harvard Health Letter from October 1997:
The latest report from the Nurses’ Health Study speaks to some of those issues. In this ongoing investigation, begun in 1976, the researchers examined the impact of long-term HRT use in more than 60,000 nurses. They looked at length and continuity of hormone use and how this affected the women’s death rates. They also studied women who used estrogen alone or in an estrogen/progesterone combination and adjusted their data to account for smoking, weight, exercise, and other lifestyle habits.
Overall, the researchers found that the death rate among HRT users was 37% lower than that of women who had never taken hormones, primarily because the hormones appeared to protect women against heart disease. Indeed, the risk of dying of cardiovascular disease was 53% lower in the HRT group.
But since then, attitudes have changed. For example, from the NIH:
Do not use estrogen plus progestin therapy to prevent heart disease. The new findings show that it doesn’t work. In fact, the therapy increases the chance of a heart attack or stroke. And it increases the risk of breast cancer and blood clots.
The Nurses’ Health Study seems to have struck out on that one! The women who took HRT were apparently quite a bit different, on average, from those who didn’t, even after “controlling” for background variables.
But is it possible that, if the data from the Nurses study had been analyzed using propensity scores (see here for a description of the method), more reasonable claims would have been made from the beginning?
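For readers who have not seen the method, here is a minimal sketch of one common way to use propensity scores: estimate each person's probability of treatment from background variables, group people into strata of similar estimated probability, and compare treated and untreated within strata. Everything here is an assumption made for illustration: the data are simulated (not from the Nurses study), the single covariate `health` and the helper names `fit_logistic` and `stratified_effect` are hypothetical, and the simulation is built so that treatment has no effect and the confounder is fully observed. That is the favorable case; the adjustment can only balance variables that go into the propensity model.

```python
import numpy as np

def fit_logistic(X, t, n_iter=25):
    """Fit a logistic regression by Newton's method (X includes an intercept column)."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        w = p * (1.0 - p)
        grad = X.T @ (t - p)
        hess = X.T @ (X * w[:, None])
        beta = beta + np.linalg.solve(hess, grad)
    return beta

def stratified_effect(y, t, score, n_strata=5):
    """Size-weighted average of treated-minus-control outcome rates within score strata."""
    edges = np.quantile(score, np.linspace(0, 1, n_strata + 1))
    stratum = np.clip(np.searchsorted(edges, score, side="right") - 1, 0, n_strata - 1)
    diffs, sizes = [], []
    for s in range(n_strata):
        in_s = stratum == s
        treated, control = in_s & (t == 1), in_s & (t == 0)
        if treated.any() and control.any():  # skip strata lacking either group
            diffs.append(y[treated].mean() - y[control].mean())
            sizes.append(in_s.sum())
    return np.average(diffs, weights=sizes)

# Simulated data (an assumption, for illustration only).
rng = np.random.default_rng(0)
n = 20000
health = rng.normal(size=n)                              # observed background variable
t = rng.binomial(1, 1.0 / (1.0 + np.exp(-health)))       # healthier people are more likely to be treated
y = rng.binomial(1, 1.0 / (1.0 + np.exp(2.0 + health)))  # event risk depends on health only, not on t

# Estimated propensity score: P(treated | health).
X = np.column_stack([np.ones(n), health])
score = 1.0 / (1.0 + np.exp(-X @ fit_logistic(X, t)))

naive = y[t == 1].mean() - y[t == 0].mean()
adjusted = stratified_effect(y, t, score)
print(f"naive difference in event rates:    {naive:.3f}")
print(f"propensity-stratified difference:   {adjusted:.3f}")
```

In this simulation the naive comparison makes the treatment look protective even though it does nothing, and stratifying on the estimated score removes most of that gap. If `health` were left out of the matrix `X`, the strata would no longer separate healthier from less healthy people and the spurious benefit would remain.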
The Nurses Study continues to operate and make headlines, so this is still a live issue.
I was writing about matching and propensity scores because these are the adjustment methods most familiar to me. The question could equally be asked about other methods, such as g-estimation (see the comments of Jamie Robins at this 2003 meeting).