Martha Smith writes:
Yesterday BBC News Magazine featured an article by William Kremer entitled “Why are placebos getting more effective?”, which looks like a possibility for a blog post discussing how people treat surprising effects. The article asserts that the placebo effect has been decreasing, especially in the U.S.
The author asks, “Why? What could it be about Americans that might make them particularly susceptible to the placebo effect?” and then gives lots of speculation. This might be characterized as “I believe the effect is real, so I’ll look for possible causes.”
However, applying the skeptical maxim “If an effect is surprising, it’s probably false or overestimated,” I quickly came up with two plausible reasons why the increasing effect of placebos might be apparent rather than real:
1. The statistical significance filter could operate indirectly: One reason a study comparing treatment with placebo might get through the statistical significance filter is because it happens to have an uncharacteristically small placebo effect. Thus small placebo effects are likely to be overrepresented in published studies; a later replication of such a study is likely to show a larger (but more typical) placebo effect.
2. If early studies are not blinded but later studies are, the earlier studies would be expected to show deflated effects for placebo but inflated effects for treatment.
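Reason (1) can be illustrated with a quick simulation (the trial sizes, effect sizes, and units here are all made up): generate many two-arm trials in which the true placebo response is identical everywhere, keep only the trials where the drug beats placebo at p < 0.05, and compare the average observed placebo response in the “published” trials with the truth. The filtered trials show a systematically smaller observed placebo response, so a replication of one of them would tend to show a larger, more typical one.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n = 20            # patients per arm (hypothetical trial size)
mu_placebo = 1.0  # true mean improvement on placebo, arbitrary units
mu_drug = 1.5     # true mean improvement on the drug
sigma = 1.0

placebo_means_all, placebo_means_published = [], []
for _ in range(5000):
    p_arm = rng.normal(mu_placebo, sigma, n)
    d_arm = rng.normal(mu_drug, sigma, n)
    _, pval = stats.ttest_ind(d_arm, p_arm)
    placebo_means_all.append(p_arm.mean())
    # the "filter": only trials with a significant drug-vs-placebo gap get published
    if pval < 0.05 and d_arm.mean() > p_arm.mean():
        placebo_means_published.append(p_arm.mean())

mean_all = np.mean(placebo_means_all)            # close to the true value 1.0
mean_published = np.mean(placebo_means_published)  # systematically smaller
print(mean_all, mean_published)
```

Nothing here depends on the placebo response actually changing over time; the shortfall in the published trials comes entirely from conditioning on significance.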
My reply: There’s something about this placebo thing that just keeps confusing me. So I’ll stay out of this one, except to post the above note to give everyone something to think about today.
Re one common topic here: the figure in the article is an example of bad graphic design. I have defective color vision, and I found it difficult to match the lines to the legend.
Hard to understand this statement:
“The article asserts that the placebo effect has been decreasing, especially in the U.S.”
If, as William Kremer says, placebos are “getting more effective,” then linguistically it appears the placebo effect is increasing. Or is the “placebo effect” defined as the difference between placebo and treatment, and thus the placebo effect is decreasing?
Thanks for pointing out the garbled writing. It should have been “increasing,” not “decreasing.”
But I could have been clearer still, so here’s a belated attempt:
The title of the article asks, “Why are placebos getting more effective,” which appears to be asserting that the placebo effect is increasing.
But the first paragraph says,
“Research shows that over the last 25 years the difference in effectiveness between real drugs and these fake ones has narrowed – but more in the US than elsewhere. Are Americans really more susceptible to placebo effects, or is something else going on?”
Then a few paragraphs down, after describing what “placebo effect” is, he writes,
“No-one knows why the placebo response is rising but a fascinating new study in the journal Pain might help experts pin it down.” So he sure seems to have made the leap from “the difference between drug and placebo effects is narrowing” to “the placebo effect is increasing,” which is what struck me as poor reasoning.
I think it really means that the effect size in placebo-controlled studies is decreasing. Maybe. But it reads like word salad.
Rather than the placebo getting more effective, the intervention could be becoming less effective. This appears to be occurring in randomized trials of pre-K programs:
Well, saying “the intervention is becoming less effective” is the same as saying “the effect size is decreasing.” But in terms of why programs decrease in effectiveness as they grow (the “scale-up” problem), there was some really interesting discussion of data about selection bias in early intervention sites on this blog last year.
Senn argues that the placebo effect may simply be a consequence of regression to the mean. If participation in a randomized trial requires a minimum score on an outcome measure prior to randomization, then one might expect “improvement” across the board due to regression to the mean. If inclusion criteria are more relaxed now than in the past, or if outcome measures have lower within-subject residual variability now than in the past, then the placebo effect will be smaller.
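Senn’s point can be sketched in a few lines (the threshold and variances below are arbitrary): give each patient a stable true severity plus measurement noise, enroll only those whose screening score exceeds a cutoff, and measure the enrolled patients again with no treatment at all. The enrolled group “improves” purely through regression to the mean.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000
true_severity = rng.normal(50, 10, n)            # stable underlying condition, arbitrary units
baseline = true_severity + rng.normal(0, 10, n)  # noisy screening measurement
followup = true_severity + rng.normal(0, 10, n)  # second measurement, nothing done in between

eligible = baseline >= 60                        # trial requires a minimum severity score
apparent_change = followup[eligible].mean() - baseline[eligible].mean()
print(apparent_change)  # negative: apparent "improvement" with no intervention at all
```

Raising the cutoff or making the measurement noisier both make the apparent improvement larger, which is the mechanism behind the relaxed-criteria and residual-variability points above.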
Hey, I tried.
Makes sense, but if it were so simple you would think this would have been figured out long ago. On the other hand, I remember a paper from a while back saying that most studies don’t even report what is in the “placebo”; I can’t find it at the moment. I guess it is possible that the study of and reporting on the placebo effect has just been so sloppy that no one could figure out what is going on.
Regression to the mean is certainly part of the placebo effect. But it may not be all of it. I can’t at the moment recall specific references, but I have seen three arm trials, active drug vs placebo vs no intervention, where the placebo group does better than the no intervention group.
While psychological effects may play a role, particularly if the outcomes involve self-assessment by the subjects, or unblinded rater-assessment, there could be other things. For example, it may be that just interrupting your routine 3 times a day to take a pill may have some effects, even if the pill contains nothing biologically active. These “collateral” aspects of the real intervention may play a role in the placebo effect.
At the end of the day, though, I don’t think it’s well understood.
If the trial is run properly, shouldn’t regression to the mean balance out? Both the placebo and treatment group enter the study at a particularly low point, both get some RttM bounce-back, but the treatment group also gets any potential benefit of the treatment. If pre-treatment measures are reported for all groups, it seems like it would be straightforward to notice any baseline differences that could lead to different amounts of regression.
It also seems like some studies/measures should be fairly immune to RttM. For example, I can’t imagine a lot of people spontaneously get better in terms of ‘amount of cancer’ or ‘fat jammed into arteries’.
As I understand it, the placebo effect is simply the observed average improvement in the placebo group, regardless of what happens in the active group. Yes, regression to the mean balances out between arms, but it will still make placebo subjects appear to get spontaneously better.
Observed average improvement relative to no treatment and no placebo.
If you just have two groups, placebo and treatment, you don’t know how to disentangle the impact of doing something (taking a pill, interrupting your day, believing that the treatment will work) from the impact of doing nothing (the pill itself being inert).
” I can’t imagine a lot of people spontaneously get better in terms of ‘amount of cancer’ or ‘fat jammed into arteries’.”
But you would be wrong about that. Don’t forget that amount of cancer and “fat jammed into arteries” (which would typically be measured as the percent of cross-sectional area of an artery occluded by plaque) are subject to measurement error. The included group will, on average, have positive measurement errors, which will then regress to 0 at follow-up. The “true” amount of cancer or arterial occlusion may not change, but we have no way to observe that.
These are not small effects, either. Radiologic measurements like these are sensitive to the positioning of the patient within the equipment, the angle the beam makes with the lesion as it passes through, interference by surrounding tissues, patient movement artifacts, and many other sources. Even short-term test-retest studies of these kinds of measures typically show them to have pretty low reliability. The lower the reliability, the greater the regression to the mean.
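The reliability/regression link in that last sentence can be made concrete (with hypothetical numbers): if a measurement has reliability r, the expected follow-up value of a selected extreme measurement is r times its baseline deviation from the mean, so lower reliability means more regression. A small simulation, fixing total variance at 1 and splitting it between true signal and measurement error:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000
results = {}
for reliability in (0.9, 0.5):
    # reliability = share of total variance due to true signal
    true = rng.normal(0, np.sqrt(reliability), n)
    scan1 = true + rng.normal(0, np.sqrt(1 - reliability), n)
    scan2 = true + rng.normal(0, np.sqrt(1 - reliability), n)
    high = scan1 > 1.0  # lesions that looked large at the baseline scan
    results[reliability] = (scan1[high].mean(), scan2[high].mean())
    print(reliability, results[reliability])
```

In both cases the second scan regresses toward the mean by roughly a factor of r, so the low-reliability measure shows much more apparent “improvement” with no change in the underlying condition.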
great clickbait title
This is a very interesting question. Most of the prior literature on placebo effects (note my use of the plural) and what they mean about effective treatment or not for various human diseases is tracked by a subset of medical doctors, especially those who investigate “science-based medicine.” The Science Based Medicine group-edited blog will be interesting reading for anyone interested in learning more about placebo effects. For statisticians and statistics educators, the popular press articles about placebo effects (usually mistakenly assumed to be singular, just one kind of effect) are a treasure trove of teaching examples about mistakes in study procedure or mistakes in statistical reasoning.
For much more about placebo effects than I can cram into one blog comment, see the recent articles searched up by keyword “placebo” on the Science Based Medicine site (I’ll provide the link). My own general comment about placebo effects research is that it is clear that there is no such thing as an effective placebo treatment for any major human disease, and no evidence at all that placebo effects have a meaningful effect size except in regard to patient self-reported subjective symptoms such as pain or nausea.
So, thinking about other recent examples where changes were observed in the US population over a relatively short time (about 20 years in this case)… it’s tempting to propose post hoc explanations involving changes in American culture, drug advertising, the relationship between nurse and patient, and so on, but what if it’s something a bit simpler? For example, what if the age mix of trial participants changed slightly (this would almost certainly be true in the US)? Or what if the makeup of the drugs studied changed slightly (e.g., if the proportion of trials for a particular class of drugs like statins or SSRIs went up or down by a small amount)? Or what if there was a slight change in estimation procedures, e.g., SAS released a new version of something and it created a small but cumulatively meaningful shift in p-values? Or the FDA changed some rule on inclusion criteria?
It seems to me that this is an example where people are reading a lot into a correlation with time. The reading-in they are doing is about as valid as my commenting on the physiology of pain and speculating that neuroreceptors have changed over a 20-year period.
If, over the past decades, the symptom threshold for receiving medical intervention has fallen, couldn’t that potentially lead to a rise in the observed placebo effect?
E.g., it’s hard to see a placebo resolving excruciating stomach pain, but more likely that it would resolve milder pain?
That’s a good point too, not to mention the emergence of other painkillers for chronic pain. But this was part of the selection criteria: “patients with specific neuropathic conditions: brachial plexus avulsion, cancer-associated neuropathic pain, chemotherapy-induced peripheral neuropathy, chronic low back pain with a neuropathic component, central (poststroke) pain, complex regional pain syndrome (CRPS type I), Guillain–Barré syndrome, HIV-associated neuropathic pain, painful diabetic peripheral neuropathy (PDN), postherpetic neuralgia (PHN), posttraumatic neuralgia (CRPS type II), small fiber neuropathy, or RCTs with mixed neuropathic pain patients with diagnoses including the above,” so no mild stomach pain. Still, the HIV inclusion makes you wonder if the steady improvement in HIV care plays a role.
Well, as in my earlier comment, you have to consider measurement error here. Pain is particularly difficult to measure with precision. Typically it’s done using either a visual analog scale or a 1-10 smiley-frowny face scale. While it would be unusual for a placebo to completely resolve severe pain, regression of the positive measurement errors to zero would still occur, so that some degree of apparent improvement would typically be observed.
Also, pain, whether severe or mild, has inherent fluctuations in intensity which may reflect variation in the underlying causative condition or other variable influences such as distraction, anxiety, level of consciousness, as well as reasons we just don’t understand. Again, on average these factors will tend to be more in the direction of increased pain in a group selected by a threshold, and with the passage of time they will on average regress as well.
I think my lowering-of-thresholds explanation applies to things beyond pain. It should apply to non-subjective measures as well.
My point is something like this: We already know that some symptoms or conditions are self-limiting or resolve through the body’s natural defenses even without intervention. But if such self-resolution is more likely the weaker the ailment or symptom, then lowering the threshold we use as the gatekeeper for intervention will naturally make it seem that placebos are working more often.
Assuming we agree that over the decades we have started treating a broader list of conditions more aggressively, it seems natural to expect that placebos will seem more effective?
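That threshold argument can be sketched with made-up numbers: assume milder ailments self-resolve more often (the severity scale and resolution curve below are invented for illustration), and compare the apparent placebo “success” rate under a strict versus a lax treatment threshold.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
severity = rng.uniform(0, 10, n)  # hypothetical symptom severity scale
# made-up assumption: probability of spontaneous resolution falls with severity
self_resolves = rng.random(n) < np.exp(-severity / 3)

success_rate = {}
for threshold in (7, 3):  # strict "old" threshold vs. lax "new" one
    treated = severity >= threshold
    success_rate[threshold] = self_resolves[treated].mean()
    print(threshold, success_rate[threshold])
```

With the lax threshold, more mild (and more often self-resolving) cases make it into the treated pool, so placebos appear to work more often even though nothing about placebos has changed.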
I’m still inclined to favor reasons (1) and (2) that Andrew quoted above.
Leon Shernoff’s link here http://statmodeling.stat.columbia.edu/2016/01/12/cancer-statistics-wtf/#comment-261406 brings to mind one of many ways reason (1) might operate: if only positive results are reported, then ….
See also Keith’s response to Leon’s question.
It’s an interesting topic.
I’m not sure how reason (1) would explain why the placebo-arm response has increased over time, though. It would explain, within a particular drug development program, why the placebo response in a phase III trial was greater than in the successful phase II trial that inspired it, but not why the placebo response across all US studies for neuropathic pain has increased over the past 20 years. However, I think it’s plausible that pharma companies are better at publishing the full results of failed studies now than in the past, which could be one factor in the apparent increase in the observed placebo effect over time.
Reason (2) isn’t really applicable, as unblinded placebo studies must be extremely rare: being unblinded largely undermines the point of having a placebo control. And, as for (1), I wouldn’t expect it to explain how the placebo response has increased across the US over the last couple of decades. I haven’t read of any unblinded placebo-controlled pharma trials. Though if these did happen in the past and have been phased out over the last 20 years, then I guess the proportion of placebo patients on unblinded placebo over time could be a factor affecting the apparent placebo-effect trend, if the deflated-unblinded-placebo-effect part of reason (2) is true.
I read the Pain article this morning, and there are some aspects of the analysis that, if I were reviewing it, I would have wanted changed. There are definitely intriguing graphs showing the actual data (not just the least-squares lines the way the BBC has them) that highlight in an exploratory way the US/elsewhere contrast and apparent change over time. They also explore a lot of interesting possible explanations for the data using bivariate correlations and OLS with backward selection, and as such the paper is really an exploratory analysis more than anything else. Some explanations for the apparent difference by region are looked at and speculated about. The paper really focuses on increased length of follow-up and increased sample size in the US studies as the likely explanation. They also describe the brain studies on placebo effects (really interesting) and why the observed differences can’t really be due to those.
The authors do not speculate at all about changes in US culture or the existence of direct-to-consumer drug advertising as the explanation; that seems to come entirely from journalists. What they do speculate about is whether advertising for study participants and, in general, the US shift to CROs could have caused the observed change, which I thought was a weak hypothesis, but it was fairly presented as just something else that had shifted during the time span. Overall, very interesting data, and good at avoiding overreach. It’s not their fault that journalists come up with these pop-sociology explanations. They do say (but don’t show the data on) that changes in demographics do not explain the observed time data.
That said, I think the analysis is pretty limited. I don’t want to get into the nitty-gritty, but first, if you have two distinct regions like this, using a dummy to represent geography and then essentially speculating about many possible interactions just cries out for some variation of an HLM/mixed model, even more so when you are using time as an explanation. Just showing year as a linear effect in a scatter plot and calling it a trend is not an adequate way to understand whether there was change over time, especially when the high ranges of study length and study size are overwhelmingly US-based. Also, why assume that the impact of any change that might exist would be linear rather than plateauing after a certain point?
Also, given that they have a very strong speculation about the increase in study length as the substantive explanation for both the decline in effect size and the US/elsewhere gap, it seems to me that they need to think about modeling this, specifically the impact of truncation on the shorter studies. For example, since they have the time data from the studies, they could analyze the placebo effect with all data truncated at the same length. Further, if they have all this time-based data, why not treat study as a unit and time measurements as observations, and use a mixed model to try to capture the impact of time across all 81 trials, treating them as containing censored data, for example?
Just looking at the data presented it seems like a great data set for someone to dig into and show how all these structures of autocorrelation, clustering, differential censoring and place are tremendously important.
Thanks, Elin! Since I wasn’t able to get a copy of the article, I especially appreciate your mentioning some of the things I was wondering about (like the simplistic looking graphs in the news article!). Your idea of looking at truncated data from the newer studies to compare with the older studies sounds like an interesting idea.
I think I see your point in being skeptical of my point (1). I guess what I wrote didn’t explain well what I had in mind. A better expression might be: Since Ioannidis’s 2005 paper and its follow-ups have brought attention to how statistical artifacts (such as the statistical significance filter) can produce misleading results, recent papers are more likely to give accurate estimates of effect sizes. For example, a study comparing treatment with placebo might get through the statistical significance filter because it happens to have an uncharacteristically small placebo effect. This may be less likely to happen now than twenty years ago.
Re your comment on point (2): I am too skeptical to believe that “unblinded placebo studies must be extremely rare, as being unblinded largely undermines the point of having placebo control.” In reality, effective blinding requires careful attention to detail to keep patients from knowing which group they are in. So a study that is labeled blinded might not actually be; and if recent papers are better blinded than older ones, the older ones could have produced misleading estimates of effect sizes.
Unfortunately, I haven’t been able to get a copy of the original Tuttle et al. paper to see how true Kremer’s article is to it, and to make an informed critique of it. I did find a critique at http://www.bodyinmind.org/placebo-responses/ that discusses publication bias, regression to the mean, and other possible reasons why the claimed conclusions might be misleading, but it cautions against attributing just one cause.