I don’t often read the Iranian Journal of Cancer Prevention, but I like this quote: “God is in every leaf of every tree.”

I was thinking more about the PACE trial.
There’s been a lot of discussion about statistical problems with the PACE papers, and also about the research team’s depressing refusal to share their data. (As E. J. Graff, our editor at the sister blog put it, you funded these clinical trials, but you’ll never know what they found.)
But today I want to talk about something slightly different, which is the flawed way that research results get interpreted and translated into policy.
There are big problems here, which I attribute in part to one of our favorite villains, deterministic thinking.
The PACE study is a randomized controlled trial of alternatives to treating chronic fatigue syndrome, and the main finding of the study was that cognitive behaviour therapy or graded exercise therapy were more effective in reducing fatigue and improving physical function than adaptive pacing therapy or nothing at all. All treatments also included specialist medical care. See here for descriptions of the treatments. The basic idea is that CBT and exercise therapy are about getting the patient moving again, whereas adaptive pacing therapy and specialist medical care are about managing the condition.
The follow-up paper includes a bunch of mediation analyses that, as usual, I don’t understand, but that doesn’t really matter. The key point, to take their claims at face value for a moment, is that they found that talk therapy and exercise worked in a randomized controlled trial, and this led to three larger conclusions: First, that these are the therapies that should be recommended for the general population of chronic fatigue syndrome sufferers, and that these are the treatments that should be paid for by government and insurers; Second, that when people with these conditions complain about pain and exhaustion following exercise, the solution is for them to get rid of the negative thoughts and continue exercising; Third, that, as the investigators write, “the results support a treatment model in which both beliefs and behaviour play a part in perpetuating fatigue and disability in chronic fatigue syndrome.”
Or, as chronic fatigue syndrome sufferer and science writer Julie Rehmeyer put it, “according to the theory underlying this psychiatric research, my problem was that I was out of shape, afraid of exercise, and obsessed about my symptoms. The path to wellness was to drop the idea that I had a physical disease and steadily increase my exercise, no matter how bad it made me feel.”
So. My problem is in the conclusions being drawn from the study. I see the following chain of reasoning:
1. It’s a randomized controlled trial, perhaps the only large study of this sort on chronic fatigue syndrome. Results are (I assume) statistically significant.
2. This is the gold standard: a statistically significant effect in a randomized trial. I’ll assume for now that the design and analysis were clean. There’s some dispute about this, in particular regarding the appropriateness of the outcome measures, but for now let’s just take the empirical clinical finding as true. Just speaking generally, CBT and exercise therapy are both good things, so it is certainly plausible that they would help on average this group too.
3. Evidence-based medicine. If CBT and exercise therapy are the only proven treatments for chronic fatigue syndrome, then I see the argument for (a) recommending these treatments, and (b) being reluctant to pay for unproven alternatives.
4. Conclusions about how the disease works. If talk therapy cures it, the condition must be psychosomatic. If exercise therapy works, it must be that the people with this condition could get better, if they didn’t have these mental blocks.
What went wrong? Lack of understanding of variation. As noted above, it’s no surprise that CBT and exercise therapy can help people. Great. But lots of people with chronic fatigue syndrome will have big problems even if they do get CBT and exercise therapy. Even in the published study, the success rate was far below 100%. The success of these therapies for some percentage of people does not at all contradict the idea that many others need a lot more, nor does it provide much support for the idea that “fear avoidance beliefs” are holding back people with chronic fatigue syndrome.
At this point, you might feel that I’m being too harsh on the PACE study. If a randomized controlled trial won’t satisfy me, what will? My response is that you have to take the study for what it is: you can’t assume that the positive features of the randomized trial will somehow lend credence to extrapolations, and you can’t assume that just because a treatment does something for some people, you’ve learned some general principle. Not for something like chronic fatigue syndrome, which is, presumably, some mix of different conditions.
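The point about variation can be made concrete with a toy simulation. All the rates below are invented for illustration, not taken from the PACE data; the point is just that a treatment can show a clear average benefit in a trial while most treated patients still do not improve.

```python
import random

random.seed(0)

# Invented rates, purely for illustration: the treatment helps a 30%
# subset of a heterogeneous patient population; 10% of controls
# improve on their own.
n = 1000
improved_treat = sum(random.random() < 0.30 for _ in range(n))
improved_ctrl = sum(random.random() < 0.10 for _ in range(n))

print("treatment arm improved:", improved_treat / n)  # roughly 0.30
print("control arm improved:  ", improved_ctrl / n)   # roughly 0.10
# A clear average benefit over control -- and yet roughly 70% of
# treated patients did not improve, which tells us nothing about
# what those non-responders would need.
```

In a trial like this the treatment “works” on average, but concluding that the non-responders merely hold the wrong beliefs reads far more into the comparison than it contains.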
Don’t ask, “Which one is right?”
A commenter in yesterday’s thread picked up this revealing quote from Lancet editor Richard Horton:
[A]daptive pacing therapy essentially believes that chronic fatigue is an organic disease which is not reversible by changes in behaviour. Whereas cognitive behaviour therapy obviously believes that chronic fatigue is entirely reversible. And these two philosophies are kind of facing off against one another in the patient community and what these scientists were trying to do is to say, ‘Well, let’s see. Which one is right?’
No no no no no. It’s just a radio quote so I’m not going to criticize Horton for saying that a therapy “believes” something; his meaning is clear. My problem is with his attitude that it’s one or the other.
“Which one is right?”, he asks, but this is not in general a good question to ask.
There are two reasons this is a bad question:
1. Chronic fatigue syndrome is a diverse condition, or set of conditions lumped under a common diagnosis. It is completely reasonable to think that different therapies could work for different people, and that the condition has different sources for different people. Indeed, even the PACE trial itself found many people did not improve under their treatments.
2. Even for any particular person, the condition could have a mix of causes and be amenable to a mix of therapies. So the attitude that it’s one or the other can be a serious mistake even for a given patient, let alone when trying to characterize a broadly-diagnosed syndrome in the general population.
These points are relevant and important even if the published PACE trial had no flaws at all. That is, I believe Horton is seriously mistaken in his argument, even setting aside all the concerns about those published papers and even setting aside the refusal to share the data.
Making a categorical assertion about something broader than the specific research hypothesis is quite common: researchers assert that big X is “true” or “real” based on specific finding y, with little regard to effect sizes, variability, or limitations. I’m going to assume that, if pressed, Horton would retreat to defending the efficacy of the specific treatments based on the trial. The tendency to generalize and become categorical, however, is pervasive: many discussion sections are distressingly untethered from the corresponding results and analysis sections.
I agree that there’s a problem with making general statements about CFS patients when the CFS label undoubtedly covers a range of diseases, but that’s the least of the problems with the PACE trial.
Andrew, you said that “The key point is that they found [in PACE] that talk therapy and exercise worked in a randomized controlled trial” but I’m afraid they didn’t. They ran an unblinded trial and used subjective measures for their main outcomes. It’s hardly surprising that patients rated the experimenters’ favoured therapies as more effective. On objective measures, the trial was a clear failure (but you won’t read about that in the Lancet paper – the experimenters eked those findings out over several years’ worth of publications). And at long-term follow up, there’s no difference even in self-report between those who got a talking/exercise therapy and those who didn’t.
If you wanted to run a psychology experiment to test the effects of social pressure on patients to favour the experimenters’ favoured treatments in their self-ratings, it would look exactly like the PACE trial. It simply cannot be taken seriously as a trial of therapeutic interventions.
PACE was a big fat £5m failure but very few seem able to face it.
My point was that these conclusions did not seem to follow, even taking the study’s claims at face value and setting aside any methodological criticisms. I added a phrase to the above post to clarify.
Thanks, Andrew, I appreciate your taking the trouble to do that.
If you’d like to go the whole hog, you might like to extend that change to this early para: “The PACE study is a randomized controlled trial of alternatives to treating chronic fatigue syndrome, and the main finding of the study was that cognitive behaviour therapy or graded exercise therapy were more effective in reducing fatigue and improving physical function than adaptive pacing therapy or nothing at all.”
“Main finding” could be replaced by “main claim by the authors”, perhaps.
I agree completely that the conclusions don’t even follow from the face-value results – I’d just like to see the face value of the results being challenged more often, and Richard Horton acknowledging the overwhelming force of that challenge.
The naked emperor that the PACE trial represents has been running down the street quite long enough. It’s Horton’s job to stop him. I wish he would.
Motte and Bailey?
Thank you for that link. It was hilarious. I’ve bookmarked it.
I’ve been in agreement with you through all of these PACE discussions – but this time I must protest. Are you saying that due to variation in the disease and treatments, virtually all potential treatments should be funded? My fundamental concern is that there are real decisions that must be made in the face of uncertainty. The answer cannot be that uncertainty means that anything is legitimate and anything is worth paying for. Now I know you aren’t saying that. But the question then becomes how much evidence is required to permit a treatment to be paid for – and/or how much evidence is required for a treatment to not be paid for? While the evidence is always indeterminate (I’m all for getting rid of deterministic thinking), decisions are often (perhaps always) deterministic. I think it is an important shift of thinking to view these deterministic decisions as more malleable and constantly changing in the face of new evidence – but I don’t think that avoids the need to make deterministic decisions. Your reasoning seems to me to suggest that we need not make such determinations.
No, I am not saying that due to variation in the disease and treatments, virtually all potential treatments should be funded. Decisions must be made, and I recommend making these decisions based on cost-benefit analyses with explicit statements of uncertainty.
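To make that concrete, here is a minimal sketch of what a cost-benefit calculation with an explicit statement of uncertainty might look like. Every number is invented for illustration (the cost, the assumed value of one patient improving, and the uniform interval standing in for uncertainty about the improvement rate); the point is the shape of the calculation, not the answer.

```python
import random

random.seed(1)

# Invented numbers, for illustration only.
cost_per_patient = 2000.0       # cost of a course of therapy
value_of_improvement = 15000.0  # assumed value of one patient improving

def expected_net_benefit(n_draws=100_000):
    """Average net benefit per patient, treating the improvement rate
    as uncertain rather than as a fixed 'proven' number."""
    total = 0.0
    for _ in range(n_draws):
        # Suppose the evidence is consistent with anywhere from 5% to
        # 35% of patients improving.
        rate = random.uniform(0.05, 0.35)
        total += rate * value_of_improvement - cost_per_patient
    return total / n_draws

print(expected_net_benefit())  # about 1000 per patient under these assumptions
```

The same machinery makes the decision revisable: as new evidence narrows the interval, the expected net benefit (and the funding decision) gets updated rather than frozen by a one-time verdict.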
Fine, I can go along with that. But it is worth thinking about where this line of policy will lead us. A careful cost-benefit analysis explicitly based on the uncertainty of current knowledge, combined with open data and continued studies of treatments and effects, results in a much more uncertain business prospect for drug firms (and other treatment providers). I do believe that is a more realistic and productive environment and actually reflects how knowledge evolves. But it is not without its own consequences. As we make drug research (to use a concrete example) more risky (the increased risk is that when a drug is “approved” it will not have 7+ years of guaranteed marketability, but subject to continued re-evaluation) the result is likely to be less private investment in drug research – or at least a move towards more inexpensive and short-term investments. Also, there will be a tendency for drug firms to merge (already happening) as ways to mitigate the risks. I suspect that only a few large drug conglomerates will survive. Then there is the issue of who will finance continued study and long-term research. In an ideal world, this is perhaps the role for government (it used to be places such as the NIH but that has ceased to play such a role). Most economists (that is my training and I will guess what most of my colleagues would say) would prefer a more competitive private market – and if the choice is between a government provided medical research sector and a private one with only a couple or three firms, then all choices will be quite imperfect.
I won’t try to guess the outcome, but I think as a policy matter we need to think carefully about the consequences of moving from a “deterministic” mindset to one that is more evolutionary and uncertain. I personally believe the net effect is positive, but unintended consequences are almost guaranteed and it pays to think about these ahead of time.
We’ve had discussions on this blog recently in which I’ve argued, and a few (such as Keith O’Rourke) have piled on, that we should end the process of DRUG APPROVAL and move to a process of meeting a minimal standard for safety (the drug isn’t actually severely toxic) and an additional constantly updated independent analysis of various “metrics”.
Publish an estimated curve for the frequency of moderate side effects at standard doses through time (i.e., onset after N doses), a similar curve for severe side effects or toxicity, and a curve for the percentage of patients “cured” after N doses (if that’s an appropriate thing to consider, as for an antibiotic or anti-fungal) or the percentage of patients exhibiting some important primary outcome (cancer remission, end of auditory hallucinations, return of ocular pressure to within normal tolerances, etc. – obviously custom-designed for whatever the drug is supposed to do).
Let patients and doctors work together using this publicly available data published in a single location (FDA website?) based on independent (ie. not operated by the drug manufacturer) trials and continued monitoring.
For SURE we would have a better selection of drugs, lower risk to drug research (even if your drug turned out to be slightly problematic, it’s still possible it might have appropriate uses, where the benefits outweigh the problems).
There is no one-size-fits-all cost/benefit analysis, so it makes more sense to provide the data on benefits/side-effects/costs and let the doctors and patients work these things out in their specific context.
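As a sketch of one of the metrics described above, here is a toy “percentage cured after N doses” curve computed from hypothetical monitoring records. The data and the function are made up, and a real version would have to handle dropout and censoring properly (e.g., with a Kaplan-Meier-style estimator); this just shows the shape of the published metric.

```python
def cured_curve(doses_at_cure, max_doses):
    """Fraction of patients cured by each dose count 1..max_doses.
    doses_at_cure[i] is the dose at which patient i was first
    classified as cured, or None if the patient was never cured."""
    n = len(doses_at_cure)
    return [
        sum(1 for d in doses_at_cure if d is not None and d <= dose) / n
        for dose in range(1, max_doses + 1)
    ]

# Hypothetical records for six patients; two were never cured.
records = [3, 1, None, 2, 5, None]
print(cured_curve(records, 5))  # [1/6, 2/6, 3/6, 3/6, 4/6]
```

A doctor and patient could then read off, say, the chance of being cured within five doses and weigh it against the corresponding side-effect curves for their specific situation.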
“We’ve had discussions on this blog recently in which I’ve argued, and a few (such as Keith O’Rourke) have piled on, that we should end the process of DRUG APPROVAL and move to a process of meeting a minimal standard for safety (the drug isn’t actually severely toxic) and an additional constantly updated independent analysis of various “metrics”.”
100% this. The FDA should not be in the business of assessing drug effectiveness at all, and in fact was never funded to be able to do so effectively. The original job of the FDA was to make sure the ingredients were described appropriately (adulteration and misbranding). It would be best if its responsibilities were limited to the original mandate, although I doubt many would complain if some government organization provided safety guidelines. Check out “FDA Science and Mission at Risk: Report of the Subcommittee on Science and Technology”:
“When the Federal Food, Drug, and Cosmetic Act was originally enacted in 1938, the regulatory and compliance issues FDA faced were comparatively simple. From that modest beginning, however, FDA’s role as gatekeeper to new products has expanded enormously. Through the enactment of a series of landmark statutes, beginning in the 1950s and extending through the 1970s, FDA was given a mandate by Congress to review and approve prior to marketing, the safety of color additives, human food additives and animal feed additives, as well as to review and approve the safety and effectiveness of new human drugs, new animal drugs, human biological products and medical devices for human use. As a practical matter, today no new pharmaceutical product or medical technology can be used in the US without FDA first determining that it is safe and effective for its intended use. In 1990, Congress added pre-market approval for disease prevention and nutrient descriptor claims for food products, and in 1994 it added pre-market review for newly marketed dietary supplements.

FDA’s responsibilities have continued to expand. During the past two decades Congress has enacted 125 statutes that directly impact FDA’s regulatory responsibilities — an average of more than six each year — in addition to the core provisions of the 1938 Act itself and its amendments from 1939 to 1987. Each of these statutes requires some type of FDA action. Many require the development of implementing regulations, guidance or other types of policy, and some require the establishment of entirely new regulatory programs. Virtually all statutes require some type of scientific knowledge or expertise for the Agency to adequately address them, and in some cases may require laboratory research. Yet none of these statutes has been accompanied by an appropriation of the new personnel and increased funding necessary to enable adequate implementation. In fact, during the same 20-year period from 1988 to 2007, while faced with 123 new statutes, FDA gained through appropriation only 646 employees — an increase of 9 percent — and lost more than $300 million to inflation.”
Good comment. Suppose we have three groups of 100 subjects, 70 women and 30 men in each. We give one group nothing, the second group CBT, and the third group red pills. 0 in the first group get better, 70 in the second group (all the women), and 30 in the third group (all the men). If FDA policy is only to approve the drug that is most effective, they’ll approve CBT and disapprove of the red pills, because though both are significantly better than no treatment, CBT has a 70% success rate, not a 30% rate.
For psychological conditions, especially, people differ.
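The arithmetic above, spelled out (these are of course the invented numbers from the example, not real trial data):

```python
# 100 subjects per arm: 70 women and 30 men in each.
# CBT cures all the women and no men; the red pills cure all the men
# and no women; nobody improves with no treatment.
arms = {
    "none":      {"women_cured": 0,  "men_cured": 0},
    "CBT":       {"women_cured": 70, "men_cured": 0},
    "red pills": {"women_cured": 0,  "men_cured": 30},
}

for name, r in arms.items():
    overall = (r["women_cured"] + r["men_cured"]) / 100
    women_rate = r["women_cured"] / 70
    men_rate = r["men_cured"] / 30
    print(f"{name}: overall {overall:.0%}, women {women_rate:.0%}, men {men_rate:.0%}")

# An "approve only the most effective treatment" rule keeps CBT (70%
# overall) and drops the red pills (30% overall), even though the
# pills cure 100% of the men and CBT cures none of them.
```

The overall comparison is correct as far as it goes; it just answers the wrong question for any individual patient once the population is heterogeneous.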
As much as I’m often very critical of the FDA, I don’t think they would decide as you have outlined: My impression is that they are quite careful to analyze by sex, and often do separate trials for seniors and children. (However, the problem remains of the legality of “off-label” prescribing — but that is not the FDA’s fault.)
@Martha: by sex, yes, but by X where X is not a standard category? Especially where X is in fact unknown (perhaps a combination of Single Nucleotide Polymorphisms that hasn’t been identified yet for example?)
It also makes sense to try to identify predictive factors for which treatment will be effective in the sub-groups across the variation spectrum. In other words, stop thinking like “we’re going to identify THE treatment” and start thinking like “we’re going to identify how best to choose a treatment for a given person”.
> If a randomized controlled trial won’t satisfy me, what will?
Prof. Jonathan Edwards has called the PACE trial valueless. He argues that the combination of lack of blinding of treatments and choice of subjective primary endpoint is a fatal flaw.
I understand that this combination allows various forms of bias to affect the outcomes, with no way to correct for them. Just like in an unblinded trial of homeopathy. I see little reason to believe that the modest effects seen are anything more than a placebo effect. Consistent with this, at the 2.5-year long-term follow-up there was no difference between the intervention groups. It literally didn’t matter which treatment they had received (and patients were still sick).
A layman’s question based on nothing but a very superficial look at the 2013 Walwyn et al. paper: are adaptive pacing and cognitive behaviour therapy really competitors? Does adaptive pacing include cognitive behaviour therapy components? From the quote above I don’t get why one can’t just combine both methods, as in many cases there is no such “entirely reversible” result.
They are mutually exclusive. Pacing is based on the idea that there is an underlying illness which restricts how much a patient can do without exacerbation of symptoms. The exacerbation of symptoms is severe and pathological. The type of CBT we are talking about here is based on the idea that patients are suffering from the maladaptive belief of having an illness, which leads them to rest too much and become deconditioned, which produces physical sensations that are falsely interpreted by the patient as symptoms of an illness. The CBT aims to correct what is viewed as false beliefs.
A survey of patients conducted by the ME Association in the UK showed that 74% of participants reported worsening of symptoms with GET.
+1 Whenever CBT is in play the working clinical hypothesis/dogma is that the patient has “false beliefs” that she can talk herself out of.
Can you illustrate how your style of decision making (“cost-benefit with explicit uncertainty”) can be applied to something like CFS?
It sounds good in the abstract but I’m not sure how that can be translated into actual policymaking.
I am also very curious about this.
Perhaps something like this?
I saw the Horton quote as a reflection of the pragmatic way in which patients with CFS are managed by UK medical staff. An important part of the biopsychosocial approach is to provide a model of illness to patients in which they are encouraged to believe that they have a condition which is reversible through rehabilitation, instead of having doctors be honest about the poor quality of research and evidence in this area. You’re quite right that PACE cannot be used to justify truth claims about the nature of CFS, but when patient cognitions are medicalised, informed consent can go out the window.
Also, I don’t think that this is true: “The key point is that they found that talk therapy and exercise worked in a randomized controlled trial”. It depends on what you mean by ‘worked’, but I don’t think that the data so far released does show that either CBT or GET led to real improvements in patients’ health. We have some evidence that being provided with these interventions, which include particular models of illness and ‘positive’ claims about the efficacy of these treatments, led to some improvement in subjective questionnaire scores (although at LTFU the scores of those who received CBT/GET were no better than those who did not), but the more objective outcomes used as part of the trial did not indicate that these interventions were of real value: http://www.bmj.com/content/350/bmj.h227/rr-10 http://www.thelancet.com/journals/lanpsy/article/PIIS2215-0366%2815%2900089-9/fulltext
I think that some of the issues addressed in this blog may be more relevant in other contexts, while in the case of the PACE trial, they’re somewhat beside the point.
I just wanted to mention something about the non-blinding of the therapies in PACE: not only were the PACE CBT/GET ‘therapies’ not blinded, but actively telling study participants that (a) recovery is possible and (b) CBT and GET are effective towards this goal is an integral part of these supposedly successful treatments! (1,2) This is yet another issue that the PACE authors are not only well aware of but have stated themselves in the past, just like the non-normally distributed curves of the ‘normal ranges’.
So not only are patients explicitly told that CBT/GET are effective and can help them ‘recover’, but they’re also told that pacing, i.e. adjusting to symptoms, will not help them in this regard. Given the fact that almost no objective improvements were recorded in the CBT/GET groups to correspond with the marginal improvements reported in patient self-report scores, it seems like all the authors did was document the placebo effect in action. Can you imagine if a trial of homeopathy or similar did the same thing?!
1. “If learning to cope with CFS is the jointly agreed maximal goal of treatment, patients will engage with treatment accordingly. If the therapist suggests that recovery is possible, the patient expectations are raised, which in turn may lead to a change in the perception of symptoms as well as disability. This is also the essence of the placebo response.”
Knoop H, Bleijenberg G, Gielissen MFM, van der Meer JWM, White PD. Is a Full Recovery Possible after Cognitive Behavioural Therapy for Chronic Fatigue Syndrome? Psychother Psychosom 2007;76:171–176.
2. “Both graded exercise therapy and cognitive behavior therapy assume that recovery from chronic fatigue syndrome is possible and convey this hope more or less explicitly to patients. Adaptive pacing therapy emphasizes that chronic fatigue syndrome is a chronic condition, to which the patient has to adapt.”
Bleijenberg G, Knoop H. Chronic fatigue syndrome: where to PACE from here? Lancet 2011; DOI: 10.1016/S0140-6736(11)60172-4.
I don’t agree with the wording of some of this, but I do think there is a very important point being made.
The PACE PIs went to great lengths to define adverse reactions. They distinguish between adverse events and adverse reactions, between reactions to supplementary therapies and to the proposed interventions, and between feeling temporarily worse before improving and a true adverse reaction. They do not, however, define what is an effective treatment. Any improvement on any measurement is counted but, particularly on the subjective tests, may occur for any number of reasons. The patient could have a co-morbid psychological problem, may want to please the therapist, may want to convince themselves, or may be amenable to psychotherapy.
To move from a few modest improvements on some subjective tests to declaring something an effective treatment and from that to a conclusion about the nature of ME is absurd.
I have been really enjoying your continued PACE analysis, but I have to say I have a few bones to pick with this one. I of course understand you are not an expert in the ME/CFS field and the politics, history and medical findings are quite a bit to wade through, but hope you will indulge me to point out a few things…
In terms of PACE-based therapies, the objections patients and ME clinicians have (beyond the poor science) are not just that the specific GET/CBT proposed is ineffective, but that it is in fact harmful to a great number of patients. The fact that exercise and a bit of talking therapy sound on their face like something that couldn’t possibly cause harm is a large part of the problem.
There is a huge body of literature showing abnormal response to exertion in ME&CFS – so much so that the Institute of Medicine has proposed renaming the disease to ‘Systemic Exertion Intolerance Disease’.
For an overview of some of the exercise and muscle findings try:
Every survey of ME patients ever done has shown the majority reporting harms from GET; the recent MEA report shows 74% saying GET worsened their condition, and if you review the qualitative section at the end there are countless patients (including a few from the PACE trial) describing how GET rendered them permanently bedbound or wheelchair-dependent. See: http://www.meassociation.org.uk/wp-content/uploads/2015-ME-Association-Illness-Management-Report-No-decisions-about-me-without-me-30.05.15.pdf
And in fact, in the few cases where objective measures have been used in GET/CBT studies, the results have been NEGATIVE for GET, even in studies that are heavily skewed to a ‘biopsychosocial’ approach: https://www.researchgate.net/publication/40846607_How_does_cognitive_behaviour_therapy_reduce_fatigue_in_patients_with_chronic_fatigue_syndrome_The_role_of_physical_activity
The Belgian government has dismantled its GET&CBT program based on results it collected showing these therapies had negative effects on employment rates and disability measures.
Tom Kindlon also has some excellent analysis of harms relating to PACE specifically:
It’s worth noting as well that the PACE investigators used their own non-standard criteria to select patients (the Oxford criteria), which don’t require post-exertional symptoms, neurological symptoms, or autonomic irregularities, and require ‘fatigue’ to be the primary complaint. They then cherry-picked the 600 trial subjects from a pool of over 3000. So even if their results were valid, they might be said to apply more to the symptom of chronic fatigue than to ME or CFS. Both the IOM report and the NIH P2P report have suggested these criteria (Oxford) should not be used for research, as this overly broad definition is likely to “impair progress and cause harm”.
In terms of large RCTs, I’m not sure if PACE was indeed the largest; I would have to verify. I believe the Ampligen trials were of comparable size.
The ongoing phase-3 Rituximab trials are set to be the biggest RCT (for M.E.), and judging by the extremely promising results from three previous trials, will hopefully put some of this debate to bed.
The comments on documenting harm and on cherry picking participants are important points. Thanks for bringing them up.
What I was trying to get at with the Ampligen example (but apparently forgot to actually put in) was that it makes a good parallel to PACE.
In the sense that an FDA review of Ampligen found it to be safe, but the FDA was not convinced of efficacy despite 40% of patients reporting substantial improvement. And while there have been a lot of mis-steps in regard to Ampligen, one could argue that in this case pharmacological trials are being held to a much higher evidence standard than non-pharmacological treatments. http://www.cortjohnson.org/blog/2015/02/05/congressional-hearing-ampligen-roadblocks-fda-called/
The argument of “but it’s all we have” seems to be popular with the PACE investigators, but strikes me as rather disingenuous on a few fronts.
1) There are plenty of symptomatic treatments for various aspects of ME&CFS (postural tachycardia, neurally mediated hypotension, pain, sleep, etc.) that, while they don’t cure the disease, can dramatically improve quality of life.
2) There are a number of treatments which have been effective for subgroups of patients (Ampligen, anti-virals etc)
3) In terms of day-to-day illness management, patients already have a model they find safe and effective: pacing (which bears little resemblance to the “Adaptive Pacing Therapy” tested in the PACE trial). Pacing as it is used by patients is actively opposed by PACE trial proponents.
4) The PACE model and its proponents have been a huge factor in blocking research into effective treatments.
Hi Andrew, I’d like to take a moment to address some of the specifics that you’ve raised in relation to the PACE trial…
> “1. It’s a randomized controlled trial, perhaps the only large study of this sort on chronic fatigue syndrome. Results are (I assume) statistically significant.”
The (self-report) primary endpoints were said to demonstrate statistical significance for CBT and GET at 52 weeks, but at 2.5 year follow-up the difference between the trial arms had disappeared. However, all of the published primary endpoints were post-hoc, and the protocol-defined primary endpoints were not reported.
There were two objectively measured assessments of physical function in the trial (a walking test and a step test). The step test demonstrated no statistically significant differences between trial arms at 52 weeks, and the walking test demonstrated a statistically significant difference for GET but no benefit from CBT. However, data for the walking test was missing for a third of GET participants at 52 weeks and, without any explanation for the missing data, it seems reasonable to question whether some participants may have avoided the test for reasons related to capacity to participate, thus inflating the available data for the GET group.
> “2. This is the gold standard: a statistically significant effect in a randomized trial. I’ll assume for now that the design and analysis were clean. There’s some dispute about this, in particular regarding the appropriateness of the outcome measures, but for now let’s just take the empirical clinical finding as true. Just speaking generally, CBT and exercise therapy are both good things, so it is certainly plausible that they would help on average this group too.”
The trial was open-label and lacked a placebo control group. It used self-report measures (only) for the primary outcomes. Also, all of the primary outcomes were post-hoc (i.e. switched after the protocol was published), and the protocol-defined outcomes were not reported. For these reasons, and others, I question whether it can be described as a ‘gold standard’ trial.
> “3. Evidence-based medicine. If CBT and exercise therapy are the only proven treatments for chronic fatigue syndrome, then I see the argument for (a) recommending these treatments, and (b) being reluctant to pay for unproven alternatives.”
CBT and GET had a moderate effect size when using the (post-hoc and self-report) primary outcome measures at 52 weeks, but there was no significant effect at 2.5 years. I question whether publishing only post-hoc endpoints can be considered adequate evidence of a clinical benefit.
> “4. Conclusions about how the disease works. If talk therapy cures it, the condition must be psychosomatic. If exercise therapy works, it must be that the people with this condition could get better, if they didn’t have these mental blocks.”
Just to point out that CBT and GET were intended to improve physical function, but they did not improve objectively measured fitness, which was one of the two objective measures of physical function. CBT also did not improve walking capacity on a six minute walking test, which was the other objective measure. GET marginally improved outcomes on the six minute walking test, but the average participant had severe impairment at the end of the trial, and data was missing for a third of participants at 52 weeks without explanation.
CBT and GET were hardly a cure. At 2.5-year follow-up there were no differences between trial arms, so the treatments were not beneficial in the long term, let alone a cure.
> “1. Chronic fatigue syndrome is a diverse condition, or set of conditions lumped under a common diagnosis. It is completely reasonable to think that different therapies could work for different people, and that the condition has different sources for different people. Indeed, even the PACE trial itself found many people did not improve under their treatments.”
Indeed, subgrouping would be very useful in ME/CFS. The proportion of patients who responded to CBT or GET (when added to SMC), at 52 weeks, was 11-15%, so only a small minority of patients responded to CBT/GET. But, again, this is using post-hoc self-report endpoints in an open-label trial without a placebo control. At 2.5 years there was no difference between trial arms.
Thanks, this is very succinct and helpful.
I shudder every time I hear the phrase “gold standard” used (as it often is) to refer to just one criterion (e.g., “randomized”, or “randomized controlled”). A truly good study needs to meet lots of criteria.
Andrew – Thank you for following up on this & making good points. But if you look closely at the PACE papers, you’ll see there was no significant difference in outcomes between the four arms of the trial in terms of objective measures like work hours & walking distance (see http://www.thelancet.com/pdfs/journals/lanpsy/PIIS2215-0366(15)00089-9.pdf). That means the interventions – CBT and GET – were NO MORE EFFECTIVE than usual care. And yet the authors claim otherwise – a claim repeated so many times it’s taken on an air of truthiness.
I am in a twilight zone. The more I read about PACE, the more I get drawn into the politics, and I don’t know how we got from this study to the reality that patients are experiencing. You call it a therapy, but in reality it has translated into people being denied the most basic help for things that have solutions: in the worst cases they end up locked up in a hospital, and most are simply ignored and untreated. When in fact most symptoms are demonstrable and treatable.
It is so simple for doctors to listen to their patients and just treat the most debilitating symptoms.
This is the translation-to-practice issue with PACE and the psychosomatic view: leave the patient to swim in their own crap, LITERALLY. People are starving because they cannot feed themselves; this is the real issue, and you are all dancing around it. This is insanity. Forget about the numbers, the statistics, and the whole he-said-she-said. PATIENTS ARE STARVING. PATIENTS ARE BEING DENIED HELP, somehow based on this study!!!!
The issue with the psychosomatic position is that it says the patient is not sick, and the poor patient is left to their own devices. I am fortunate that my doctor took me seriously; I get treatment and I can lead a somewhat normal life (family, working full time…). I am still sick, but I can manage my life.
If your patient says:
“I can’t sleep”: do a sleep study; you will see the sleep-stage issues and can medicate accordingly.
“I faint or feel bad when I stand”: do a tilt-table test and treat the orthostatic intolerance.
“I feel tired and sick”: run an immune profile and treat co-infections; also check for low natural killer cell count and activity.
It is not that hard.
Andrew, thanks for an interesting post. I think what we have here in the PACE trial is an expensive demonstration as to how the manipulation of expectation can bias responding on self-report measures. As John mentions, participants in the treatment groups of interest were actually told at the beginning of the trial that these treatments would make them better!
from the CBT participants’ manual (Burgess & Chalder, 2004, p. 123):
– (CBT is) “a powerful and safe treatment which has been shown to be effective in … CFS/ME”
– “many people have successfully overcome CFS/ME using cognitive behaviour therapy, and have maintained and consolidated their improvement once treatment has ended”
from the GET participants’ manual (Bavinton, Dyer & White, 2004, p. 28):
– “in previous research studies, most people with CFS/ME felt either ‘much better’ or ‘very much better’ with GET”,
– (GET is) “one of the most effective therapy strategies currently known”
Of course, participants in the other treatment arms did not receive any such recommendations.
I would like to think that even in psychotherapy, where conventional double blinding is not possible, we could do a little better at controlling for expectation-related reporting biases.
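To make the concern concrete, here is a toy simulation (my own sketch, with made-up numbers, not PACE data) of how an expectation-driven shift in self-report alone can manufacture a group difference in an unblinded trial with no real treatment effect:

```python
import random
import statistics

random.seed(1)

# Illustrative assumptions only: no real treatment effect, but the unblinded
# "treatment" arm has been primed with positive testimonials, nudging how
# participants rate themselves on a self-report outcome.
n = 160                  # participants per arm
true_effect = 0.0        # the treatment does nothing physiologically
reporting_bias = 0.3     # primed participants rate their symptoms as better

def arm_scores(bias):
    # self-reported improvement = real effect + reporting shift + noise
    return [random.gauss(true_effect + bias, 1.0) for _ in range(n)]

treated = arm_scores(reporting_bias)  # told the treatment is "powerful and safe"
control = arm_scores(0.0)             # no such priming

diff = statistics.mean(treated) - statistics.mean(control)
se = (statistics.variance(treated) / n + statistics.variance(control) / n) ** 0.5
print(f"observed difference: {diff:.2f} (approx z = {diff / se:.1f})")
```

With these made-up numbers, the self-report “benefit” will typically look statistically significant, even though the only thing that differs between the arms is what participants were told.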
PACE raises important questions about research design standards in all psychotherapy studies. And it’s time we started addressing those questions.
This point can’t be made often enough. I just can’t understand how this thing got £5m ($8m) of public money spent on it.
I do wonder if it’s somehow “too big to fail”, like the banks in the economic crisis. I suspect that people think that a trial that was so big and expensive and funded by a government agency can’t possibly have such an inherently poor design, but it does.
£5m of nothing. What a waste.
The PACE trial could have been useful scientifically. But it was never designed that way. As revealed by the comments made by Richard Horton, the PACE trial was designed to justify resource allocation towards mental health services for these patients, rather than further the field scientifically.
Those scientific questions that it could have answered were:
-Which patients are likely to respond?
Based on previous trials, the investigators could have formed several a-priori hypotheses to try to predict from baseline data who is likely to respond (and hence allocate health resources more effectively). Instead, they adopted the yes/no thinking that Andrew Gelman mentions: their attitude is that the treatments are appropriate and effective for all patients, rather than for specific subgroups. Then they tried to do post-hoc mediation analyses, but three studies based on the same data, with three different conclusions, are hardly impressive.
-Does the subjective self-reporting of symptom improvement actually translate into objective measures of reduced disability?
Several objective outcomes (actigraphy, neuropsychiatric testing) were suggested by the patient groups that were consulted, but rejected by the trial as too burdensome!?!
In the end, the PACE trial did not add anything that had not already been shown in earlier trials. Its only innovation was a large sample size, which was offset by an effect size smaller than in previous trials.
It’s good that people have commented here about why they think the PACE trial was flawed, but in the end it’s scientists themselves who have come out and openly criticised the PACE trial.
TRIAL BY ERROR: The Troubling Case of the PACE Chronic Fatigue Syndrome Study
21 October 2015
By David Tuller, DrPH
Experts who have examined the PACE study say it is fraught with problems.
“I’m shocked that the Lancet published it,” said Ronald Davis, a well-known geneticist at Stanford University and the director of the scientific advisory board of the Open Medicine Foundation. The foundation, whose board also includes three Nobel laureates, supports research on ME/CFS and is currently focused on identifying an accurate biomarker for the illness.
“The PACE study has so many flaws and there are so many questions you’d want to ask about it that I don’t understand how it got through any kind of peer review,” added Davis, who became involved in the field after his son became severely ill. “Maybe The Lancet picked reviewers who agreed with the authors and raved about the paper, and the journal went along without digging into the details.”
In an e-mail interview, DePaul University psychology professor Leonard Jason, an expert on the illness, said the study’s statistical anomalies were hard to overlook. “The PACE authors should have reduced the kind of blatant methodological lapses that can impugn the credibility of the research, such as having overlapping recovery and entry/disability criteria,” wrote Jason, a prolific researcher widely respected among scientists, health officials and patients.
Jason, who was himself diagnosed with the illness in the early 1990s, also noted that researchers cannot simply ignore their own assurances that they will follow specific ethical guidelines. “If you’ve promised to disclose conflicts of interest by promising to follow a protocol, you can’t just decide not to do it,” he said.
Jonathan Edwards, a professor emeritus of connective tissue medicine from University College London, pioneered a novel rheumatoid arthritis treatment in a large clinical trial published in the New England Journal of Medicine in 2004. For the last couple of years, he has been involved in organizing clinical trial research to test the same drug, rituximab, for chronic fatigue syndrome, which shares traits with rheumatoid arthritis and other autoimmune disorders.
When he first read the Lancet paper, Edwards was taken aback: Not only did the trial rely on subjective measures, but participants and therapists all knew which treatment was being administered, unlike in a double-blinded trial. This unblinded design made PACE particularly vulnerable to generating biased results, said Edwards in a phone interview, adding that the newsletter testimonials and other methodological flaws only made things worse.
“It’s a mass of un-interpretability to me,” said Edwards, who last year called the PACE results “valueless” in publicly posted comments. “Within the circle who are involved in this field, it seems there were a group who were prepared to all sing by the hymn sheet and agree that PACE was wonderful. But all the issues with the trial are extremely worrying, making interpretation of the clinical significance of the findings more or less impossible.”
Bruce Levin, a professor of biostatistics at Columbia University and an expert in clinical trial design, said that unplanned, post-protocol changes in primary outcomes should be made only when absolutely necessary, and that any such changes inevitably raised questions about interpretation of the results. In any event, he added, it would never be acceptable for such revisions to include “normal range” or “recovery” thresholds that overlapped with the study’s entry criteria.
“I have never seen a trial design where eligibility requirements for a disease alone would qualify some patients for having had a successful treatment,” said Levin, who has been involved in research on the illness and has reviewed the PACE study. “It calls into question the diagnosis of an illness whose patients already rate as ‘recovered’ or ‘within normal range.’ I find it nearly inconceivable that a trial’s data monitoring committee would have approved such a protocol problem if they were aware of it.”
Levin also said the mid-trial publication of the newsletter featuring participant testimonials and positive news about interventions under investigation created legitimate concerns that subsequent responses might have been biased, especially in an unblinded study with subjective outcomes like PACE.
“It is highly inappropriate to publish anything during an ongoing clinical trial,” said Levin. “To let participants know that interventions have been selected by a government committee ‘based on the best available evidence’ strikes me as the height of clinical trial amateurism.”
At the least, the PACE researchers should have evaluated the responses from before and afterwards to assess any resulting bias, he added.
Recent U.S. government reports have raised further challenges for the PACE approach. In June, a panel convened by the National Institutes of Health recommended that researchers abandon a core aspect of the PACE trial design—its method of identifying participants through the single symptom of prolonged fatigue, rather than a more detailed set of criteria. This method, the panel’s report noted, could “impair progress and cause harm” because it identifies people with many fatiguing conditions, making it hard to interpret the findings.
Thank you for writing this essay.
I’ll leave aside the internal flaws in the PACE study, which were much better explained by David Tuller on Columbia Virologist Vince Racaniello’s blog – here is a link to the first of three essays in total: http://www.virology.ws/2015/10/21/trial-by-error-i/
What I find much more troubling is the absence of references to research that would contradict the claim that there are only two options available: CBT/GET and coping. It is as if the psychiatric literature on ME and CFS existed in a parallel universe. No one reading Wessely, White, Sharpe or Chalder would have any idea that there is a robust body of literature on biomedical factors: abnormal SPECT scans, CPET scores, cytokines, natural killer cell function, RNase-L defect, orthostatic intolerance, viral infections, and more. There is also a robust body of literature on treatments, including immune therapies and antivirals.
The authors could, reasonably, say that many of the published studies are small sample. This may be true, but unsurprising, because there has been very little funding allocated by either the US NIH or the British NHS for biomedical research into either M.E. or “CFS.”
However, given that the PACE study was supposed to be “definitive,” I don’t think it is asking too much to acknowledge that there IS ongoing research on non-psychiatric explanations and treatments. The absence of any reference to studies that might cast doubt on their own thesis rather obviously biases the reader. Given that the PACE study is to be used for policy, that bias is inexcusable.
In contrast, look at the bibliography of the recent study by the US Institute of Medicine (IOM), commissioned by the NIH:
The proponents of CBT/GET as a “cure” for ME or CFS often respond to critics with the charge that their critics just don’t “like” psychiatry as a discipline – and that they are guilty of “Cartesian mind/body dualism.” But if you look at the references in the 2011 PACE study report in Lancet and compare them with the bibliography in the IOM study, it is hard not to conclude that the real practitioners of mind/body dualism are Wessely, White, Sharpe, and Chalder.
What happens when a complex problem is reduced to a simple one by presenting only two options? I know the statistical answer: this model is seriously misspecified. Any mumbo-jumbo about statistical analysis is meaningless, because important variables have not even been considered – I say important, because they are in the peer-reviewed published literature.
This study was commissioned by the British government. What then does it mean that peer-reviewed, published journal articles with information important to decision-making are completely ignored, as if they never happened at all?
The problem is not just thinking in dichotomies in this case – it is that M.E. is a multi-systemic, complex disease. To reduce it such that it can be portrayed in such a simplistic fashion is to imagine it out of existence. An outcome that is probably the most cost-effective of all.
You’re certainly right that there are flaws in the way that research gets translated into policy. It appears all too easy for something as worryingly bad as this to be made into key policy affecting millions of people’s lives, despite its plethora of design, implementation and reporting failings. PACE is a prime example of agencies, who should be evaluating research objectively, simply giving a piece of research a free pass. Simply put, no one checked whether the analyses, such as those for recovery, actually held up, and no one required access to the data to put the claims to the test; it was all just taken at face value. That should never happen, and everyone should be concerned that it has. Some of the factors that I think helped to create this situation:
1. The history of research funding into this disease in the UK is one of an almost complete monopoly on behavioural studies, and, as a result, there is a pervasive view in this country of the disease as behavioural. So when a study comes out and says just that, no one (except those who actually know the disease) is surprised. No one sees the need to scrutinize something properly when it apparently confirms people’s existing beliefs and biases. The battle of ideas won’t be won until we have firm biomedical answers, but the researchers undertaking promising work of this kind, such as Ian Lipkin (like you, at Columbia), who is unequivocal in his view that the disease is not behavioural, are quite rightly focused on doing their much-needed biomedical research, not on tackling the flawed behavioural research from the UK. Nevertheless, PACE should not get a free pass because of this situation: all research, no matter its type, must meet minimum scientific standards, and the fact that PACE doesn’t means it should be discounted.
2. The study authors are high-status individuals, and others who supported their study are likewise very influential; this certainly helped maintain the illusion of a sound study that didn’t require the usual scrutiny.
3. When patients voiced concerns they were actively discredited on several platforms, which had the effect of making criticism appear an extremist view that no one should pay proper attention to. The authors of PACE would clearly prefer a situation where they tell people what patients want and the patients themselves shut up. As the research is supposedly about patients, this situation, in which patients are marginalized and portrayed as unreliable witnesses to their own disease, is a problem, and it is definitely a big factor here.
4. The PACE trial got a lot of press coverage and it’s no coincidence that major supporters of the PACE trial are heavily involved in the Science Media Centre who pulled out all the stops to say how great the study was. Anyone critical of the study was not given a voice and the press printed what they were given by the SMC, reinforcing the illusion more broadly.
5. As you pointed out in your previous blog post, PACE was published (and has continued to be defended by) a prestigious journal. If you can’t trust The Lancet, who can you trust? I mean, surely they know what they’re doing, so you can just accept it at face value….
Even in the current environment where PACE is being talked about at last, people are still making all kinds of dangerous assumptions about the trial, for the above reasons, and due to the reality that everyone is busy – and assumptions help speed everything up. Except they don’t if those assumptions are wrong. If a person sits down with the PACE trial with a clean slate, forgetting all those assumptions, the study looks at best amateur and at worst dishonest.
One final thing. Something I wanted to pick up from your text above is the statement that CBT is generally a good thing. On exercise I can agree (generally, though in CFS it is catastrophically not the case), but importantly the type of CBT in PACE is not the usual CBT that most people are familiar with. It is aimed solely at convincing the patient that their symptoms are not real and should be disregarded. Apply that to other diseases and you can see straight away that this would not generally be good in most cases.
“If a person sits down with the PACE trial with a clean slate, forgetting all those assumptions, the study looks at best amateur and at worst dishonest.” Even more so if you compare the pre-specified outcomes and analyses in the protocol with what ended up in the Lancet. http://bmcneurol.biomedcentral.com/articles/10.1186/1471-2377-7-6
I wonder if the baubles of the clinical trial (“large”, “randomized”, “carefully designed”, etc.) dazzle clinicians into deterministic thinking. The higher the internal validity, the worse the final interpretation: removing one set of biases only to add another.
Some decades ago a coworker of mine got CFS from mononucleosis; some months later he got another virus that ‘cured’ him. His doctor told him this was common: an unknown virus infects a patient with mono-induced fatigue syndrome and somehow cures it.
Many decades ago, in the 1960s, I worked summers in the Cincinnati Medical Computing Center, and one of the aphorisms circulating then was ‘diagnosis is easy, treatment is hard’, meaning that labeling a disease was easy but selecting the right treatment among many choices for a particular patient was hard.
What I take from this is that CFS is a cluster of diseases, and not a single disease, and in some cases it has both a viral cause and oddly enough a viral cure.
There is a growing belief that CFS is indeed likely a cluster of diseases and that’s the whole problem in a nutshell. For some patients, they have a sudden, ‘flu-like’ onset where they get violently ill and never really recover. They report sore throat, swollen lymph nodes, weird neurological sequelae (one that I distinctly remember reading about was a person reporting that when they were driving and coming to a stop light, they couldn’t remember whether green meant to stop or go), cognitive dysfunction, headache, sleep dysfunction, exertion-induced debility, etc. A lot of these patients can even tell you the time of day and what they were doing when they became ill.
For other patients such as myself and my sister, however, the onset was exceedingly gradual, with a progressive disease course: I have had a constant headache since 2003 and have been housebound since 2004. I also have exertion-induced debility, sleep dysfunction, etc. I’ve never had a sore throat, swollen lymph nodes or neurological sequelae, however (other than headache and cognitive difficulties). While we share some of the same key symptoms, the discrepancy in onset and other symptoms leads me to think that there are either different diseases or, at the very least, subgroups of the same disease. One interesting thing about the ‘viral cure’ you mention: several years ago I caught a virus from my father that consisted of nothing but a dry, hacking cough and a tickle in the back of my throat and lasted about two weeks, and for those entire two weeks my head felt wonderful, like I was on codeine. When the cold went away my headache came back, and it’s been there ever since. That’s the only time anything like that has ever happened to me.
The problem with CFS being a cluster of diseases arises when both (or all) kinds of patients are studied together in a research setting. Say you had two groups of 50 patients each, Group A and Group B, for a total of 100 patients, and say 40 of the 50 patients in Group A shared an abnormality, response to treatment, etc., while 45 of the 50 in Group B shared a different one. That would mean 80% of Group A shared a common finding while 90% of Group B shared a different finding, both of which are pretty good numbers. But if you don’t sort the participants into the appropriate subgroups, you end up with 40% and 45% instead of 80% and 90%, which is a common occurrence in CFS research studies. When the given abnormalities aren’t present in enough patients to be of diagnostic significance, they get thrown into the bin, and researchers start all over again with a new study, a new group of un-subtyped patients, etc., and the wheel goes round again.
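The dilution in this hypothetical can be written out directly; a minimal sketch, using the same illustrative 50/50 split (not real study data):

```python
# Toy illustration of subgroup dilution: two hypothetical patient groups,
# each with its own predominant abnormality.
group_a = {"X": 40, "total": 50}   # 40 of 50 share abnormality X
group_b = {"Y": 45, "total": 50}   # 45 of 50 share abnormality Y

# Analysed separately, each finding looks strong:
rate_a = group_a["X"] / group_a["total"]    # 0.80
rate_b = group_b["Y"] / group_b["total"]    # 0.90

# Pooled without subgrouping, each finding is spread over all 100 patients:
pooled_total = group_a["total"] + group_b["total"]
pooled_rate_x = group_a["X"] / pooled_total  # 0.40
pooled_rate_y = group_b["Y"] / pooled_total  # 0.45

print(rate_a, rate_b, pooled_rate_x, pooled_rate_y)
```

The same data, pooled, turn two strong subgroup findings into two weak-looking ones, which then get discarded as lacking diagnostic significance.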
This leads to the common misconception that ‘there are no abnormalities present’ in ME/CFS, when in fact all kinds of abnormalities have been reported in too many studies to count; it’s just that, because the abnormalities haven’t been found in all patients, it is concluded that no diagnostic abnormality exists. The problem is that until and unless a way is found to subtype patients, a ‘diagnostic abnormality’, i.e. a finding common to all patients, will likely never be found, since all patients diagnosed with ME/CFS might well not even share the same disease. It’s been said that looking for a common cause of ME/CFS is like looking for a common cause of jaundice or fever.
+1, especially the last sentence. That is in line with my gut instinct that ME/CFS (as now defined) might better be described as a “condition” (such as jaundice or fever) than a “disease.”
There are countless examples of either/or type of thinking in medical research studies. (In rhetoric, it is known as the “false dilemma” fallacy.) To understand how the basic underlying assumption of the PACE trial has failed, it may help to re-contextualize it in terms of another disease.
In 2008 the Journal of the Medical Association of Thailand (peer-reviewed) published a study showing that meditation has a post-prandial hypoglycemic effect in diabetics. Meditation is known to reduce anxiety. Therefore, one may conclude that diabetes is psychogenic.
According to the NIH, diabetes is an organic condition caused by a variety of factors, including genetic susceptibility, endocrine disorders and metabolic syndrome. While it can be managed, diabetes has no cure.
(To quote Horton, which one is right?)
No doctor in his or her right mind would suggest that a diabetic patient cease all treatment in favor of “thought management.” Yet that is exactly the recommendation of the PACE trial authors to ME/CFS patients, based on a study that has as much inferential value as the Thai meditation research.
Thanks for your interest in this issue. There is an urgent need for some independent senior researchers and academics to take a long hard look at it.
Two points (and apologies if they have already been made, I have not read all the comments).
“The PACE study is a randomized controlled trial…
This is the gold standard:…”
PACE was not a controlled trial, therefore not a gold standard trial. It describes itself only as “randomised”. My understanding is that the PACE authors originally also described it as controlled, but the reviewers rejected that claim.
“All treatments also included specialist medical care.”
Specialist Medical Care (SMC) also had an arm of its own, giving four arms in total (CBT + SMC, GET + SMC, APT + SMC, and SMC only).
This article is linked to in the comments:
“Afflicted by studies with small sample sizes, tiny effects, invalid exploratory analyses, and flagrant conflicts of interest, together with an obsession for pursuing fashionable trends of dubious importance, science has taken a turn towards darkness. As one participant put it, “poor methods get results”.”
The last phrase gets it exactly correct. When the null hypothesis is of no difference, the worse you run the experiment the more likely you are to “get results”. This is, of course, the problem identified by Paul Meehl regarding psychology in the 1960s, which has spread throughout the research community, including medicine, leaving every area that adopts it ravaged with misinformation.
Sorry, I was reading the comments here:
“Just speaking generally, CBT and exercise therapy are both good things, so it is certainly plausible that they would help on average this group too.”
Generally, are CBT and GET, as provided to CFS patients, good things? They are not minor interventions, and can have a profound impact upon people’s lives in a wide range of difficult to predict ways. It’s important not to move from the assumption that generally, thinking clearly and exercise are good things, to assuming that generally CBT and GET are good things.
For anyone interested in more of the details on this, a UK patient charity recently surveyed patients and the testimonies in appendix 1 of their report, starting on page 98, illustrate the range of ways in which these interventions can be experienced. http://www.meassociation.org.uk/2015/05/23959/
For the researcher’s perspective, the PACE trial also provided copies of the CBT/GET manuals here: http://www.wolfson.qmul.ac.uk/current-projects/pace-trial#trial-information
FINE, described as PACE’s sister trial, assessed Pragmatic Rehabilitation, which combines aspects of CBT and GET, and information on this intervention and the claims made to patients is available here: http://www.fine-trial.net/gparea.asp?loggedin=1
I’m not saying it’s obvious that these treatments should work. I’m just saying that I’m not surprised that these treatments should have a net average benefit in that I’d expect them to work for some subset of this diverse population of CFS sufferers.
Andrew: I think that I understood what you were trying to say, it’s just that after looking at the specifics of these interventions, it became difficult to make reasonable assumptions about whether they would be more likely to cause net harm or net benefit. There is a high cost to engaging in them, so they would need to bring substantial benefit to a large subset to lead to a net average benefit. When I knew less about CBT and GET for CFS I would say things like you are, based upon my failure to consider and understand the costs associated with these interventions, and it could be that my past is leading me to wrongly assume your reasoning is similar to what mine was. You have not really said why you think these interventions are generally good things.
Also, the testimonies from patients make clear that the impact of these interventions needs to be understood in the context of the claims made to patients about their efficacy. If these interventions are no more effective than placebo, yet are being sold to patients as being more than that, then that this is an important harm in itself.
Sure, fair enough.
The PACE Trial Gets Its Most Devastating Critique Yet from Dr Rebecca Goldin, Professor of Mathematical Sciences at George Mason University and Director of STATS.org.
She writes, “The study is under increasing scrutiny by scientists and science writers about whether its conclusions are valid. The question of how all this happened and how the criticism is being handled have sent shockwaves through medicine”.
She added, “The results from PACE… have been published in prestigious journals and influenced public health recommendations around the world; and yet, unraveling this design and the characterization of the outcomes of the trial has left many people, including me, unsure this study has any scientific merit. How did the study go unchallenged for five years?”