Pro-PACE, anti-PACE

Posted on January 13, 2016 9:56 AM by Andrew

Pro

Simon Wessely, a psychiatrist who has done research on chronic fatigue syndrome, pointed me to an overview of the PACE trial written by its organizers, Peter White, Trudie Chalder, and Michael Sharpe, and also to this post of his from November, coming to the defense of the much-maligned PACE study:

Nothing as complex as a multi-centre trial (there were six centres involved), that recruited 641 people, delivered thousands of hours of treatment, and managed to track nearly all of them a year later, can ever be without some faults. But this trial was a landmark in behavioural complex intervention studies. . . .

I have previously made it clear that I [Wessely] think that PACE was a good trial; I once described it as a thing of beauty. In this blog I will describe why I still think that . . . Here is a recent response to criticisms, few of them new.

He provides some background on his general views:

CFS is a genuine illness, can cause severe disability and distress, affects not just patients but their families and indeed wider society, as it predominantly affects working age adults, and its cause, or more likely causes, remains fundamentally unknown. I do not think that chronic fatigue syndrome is “all in the mind”, whatever that means, and nor do the PACE investigators. I do think that, as with most illnesses, of whatever nature, psychological and social factors can be important in understanding illness and helping patients recover. Like many of the PACE team, I have run a clinic for patients with chronic fatigue syndrome for many years. Like the PACE investigators, I have also in the past done research into the biological nature of the illness; research that has indicated some of the biological abnormalities that have been found repeatedly in CFS.

And now on to the trial itself:

The PACE trial randomly allocated 641 patients with chronic fatigue syndrome, recruited in six clinics across the UK . . . What were its main findings? These were simple:

That both cognitive behaviour therapy (CBT) and graded exercise therapy (GET) improved fatigue and physical function more than either adaptive pacing therapy (APT) or specialist medical care (SMC) a year after entering the trial.

All four treatments were equally safe.

These findings are consistent with previous trials (and there are also more trials in the pipeline), but PACE, because of its sheer size, has attracted the most publicity, both good and bad.

Wessely continues:

What makes a good trial and how does PACE measure up?

Far and away the most important is allocation concealment; the ability of investigators/patients to influence the randomisation process . . . No one has criticised allocation concealment in PACE, it was exemplary. . . .

Next comes power. . . . Predetermined sample size calculations showed it [PACE] had plenty of power to detect clinically significant differences. It was one of the largest behavioural or psychological medicine trials ever undertaken. No one has criticised its size.

The next thing that can jeopardise the integrity of a trial is major losses to follow up . . . The key end point in PACE was pre-defined as the one year follow up. 95% of patients provided follow up data at this stage. I am unaware of any large scale behavioural medicine trial that has exceeded this. Again, no one has questioned this . . .

Next comes treatment infidelity, which is where participants do not get the treatment they were allocated to. . . . At the end of the trial, two independent scrutineers, masked to treatment allocation, both rated over 90% of the randomly chosen 62 sessions they listened to as the allocated therapy. Only one session was thought by both scrutineers not to be the right therapy. Again, no criticism has been made on the basis of therapy infidelity.

Analytical bias. The analytical protocol was predetermined (before the analysis started) and published. Two statisticians were involved in the analysis, blind to treatment group until the analysis was completed and signed off. So again, the chances of bias being introduced at this stage are also negligible.

Post-hoc sub-group analysis (fishing for significant differences) . . . here were no post-hoc sub-group analyses in the main outcome paper. A couple of sub-group post-hoc analyses were done in follow up publications, and clearly identified as such and appropriate cautions issued. None concerned the main outcomes. Again, no one has raised the issue of sub-group analyses.

Blinding. PACE was not blinded; the therapists and patients knew what treatments were being given, which would be hard to avoid. This has been raised by several critics, and of course is true. It could hardly be otherwise; therapists knew they were delivering APT, or CBT or whatever, and patients knew what they were receiving. This is not unique to PACE. . . . Did this matter? One way is to see whether there were differences in what patients thought of the treatment, to which they were allocated, before they started them. . . . And that did happen in the PACE trial itself. One therapy was rated beforehand by patients as being less likely to be helpful, but that treatment was CBT. In the event, CBT came out as one of the two treatments that did perform better. If it had been the other way round; that CBT had been favoured over the other three, then that would have been a problem. But as it is, CBT actually had a higher mountain to climb, not a smaller one, compared to the others.

He summarizes:

So far then, I would suggest that PACE has passed the main challenges to the integrity of a trial with flying colours. . . . For example, the two most recent systematic reviews in this field rated PACE as good quality, with a low risk of bias.

On this last point he gives two references:

Larun L, Brurberg KG, Odgaard-Jensen J, Price JR. (2015) Exercise therapy for chronic fatigue syndrome. Cochrane Database of Systematic Reviews 2015, Issue 2. Art. No.: CD003200. DOI: 10.1002/14651858.CD003200.pub3.

Smith MB et al. (2015) Treatment of Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: A Systematic Review for a National Institutes of Health Pathways to Prevention Workshop. Ann Intern Med. 162: 841-850. doi: https://dx.doi.org/10.7326/M15-0114.

I have a question about the “analytical bias” thing mentioned above. Recall what Julie Rehmeyer wrote:

The study participants hadn’t significantly improved on any of the team’s chosen objective measures: They weren’t able to get back to work or get off welfare, they didn’t get more fit, and their ability to walk barely improved. Though the PACE researchers had chosen these measures at the start of the experiment, once they’d analyzed their data, they dismissed them as irrelevant or not objective after all.

This doesn’t sound like a predetermined analytical protocol, so I’m not sure what’s up with that. (Let me emphasize at this point that I’ve published hundreds of statistical analyses, maybe thousands, and have preregistered almost none of them. So I’m not saying that a predetermined analytical protocol is necessary or a good idea, just saying that there seems to be a question of whether this particular analysis was really chosen ahead of time.

Here’s what Wessely says in his post:

The researchers changed the way they scored and analysed the primary outcomes from the original protocol. The actual outcome measures did not change, but it is true that the investigators changed the way that fatigue was scored from one method to another (both methods have been described before and both are regularly used by other researchers) in order to provide a better measure of change (one method gives a maximum score of 11, the other 33). How the two primary outcomes (fatigue and physical function) were analysed was also changed from using a more complex measure, which combined two ways to measure improvement, to a simple comparison of mean (average) scores. This is a better way to see which treatment works best, and made the main findings easier to understand and interpret. This was all done before the investigators were aware of outcomes and before the statisticians started the analysis of outcomes.

There seems to be some dispute here: is it just that there was an average improvement but, when you look at each part of the total score, the difference is not statistically significant. In this case I would think it makes sense to average.

Wessely then puts it all into perspective:

Were the results maverick? Did PACE report the opposite to what has gone before or happened since? The answer is no. It is a part of a jigsaw (admittedly the biggest piece) but the picture it paints fits with the other pieces. I think that we can have confidence in the principal findings of PACE, which to repeat, are that two therapies (CBT and GET) are superior to adaptive pacing or standard medical treatment, when it comes to generating improvement in patients with chronic fatigue syndrome, and that all these approaches are safe. . . .

I [Wessely] think this trial is the best evidence we have so far that there are two treatments that can provide some hope for improvement for people with chronic fatigue syndrome. Furthermore the treatments are safe, so long as they are provided by trained appropriate therapists who are properly supervised and in a way that is appropriate to each patient. These treatments are not “exercise and positive thinking” as one newspaper unfortunately termed it; these are sophisticated, collaborative therapies between a patient and a professional.

But . . .

Having said that, there were a significant number of patients who did not improve with these treatments. Some patients deteriorated, but this seems to be the nature of the illness, rather than related to a particular treatment. . . .

PACE or no PACE, we need more research to provide treatments for those who do not respond to presently available treatments.

Anti

All of the above seemed reasonable to me, so then I followed the link to the open letter by Davis, Edwards, Jason, Levin, Racaniello, and Reingold criticizing PACE.

The key statistical concerns of Davis et al. were (1) a mismatch between the intake criteria and the outcome measures, so that it seems possible to have gotten worse during the period of the study but be recorded as improving, and the changing of the outcome measures in the middle of the study.

Regarding point (1), Wessely points out that with a randomized trial any misclassifications should cancel across the study arms. Given that the original PACE article reported changes in continuous outcome measures, I think the definition of whether a patient is “in the normal range” should be a side issue. To put it another way: I think it makes sense to model the continuous data and then post-process the inferences to make statements about normal ranges, etc.

Point (2) seems to relate to the dispute above, in which Wessely said the change was done “before the investigators were aware of outcomes,” but which Davis et al. write “is of particular concern in an unblinded trial like PACE, in which outcome trends are often apparent long before outcome data are seen.” I’m not quite sure what to say here; ultimately I’m more concerned about which summary makes sense rather than about which was chosen ahead of time. It could make sense to fit a multilevel model if there is a concern about averaging. But, realistically, I’m guessing that the study is large enough to detect averages but not large enough to get much detail beyond that—at least not without using some qualitative information from the clinicians and patients.

Davis et al. also write:

The PACE investigators based their claims of treatment success solely on their subjective outcomes. In the Lancet paper, the results of a six-minute walking test—described in the protocol as “an objective measure of physical capacity”—did not support such claims, notwithstanding the minimal gains in one arm. In subsequent comments in another journal, the investigators dismissed the walking-test results as irrelevant, non-objective and fraught with limitations. All the other objective measures in PACE, presented in other journals, also failed. The results of one objective measure, the fitness step-test, were provided in a 2015 paper in The Lancet Psychiatry, but only in the form of a tiny graph. A request for the step-test data used to create the graph was rejected as “vexatious.”

I’m not quite sure what to think about this: perhaps there was a small but not statistically significant difference for each separate outcome, but a statistically significant difference for the average? If so, then I would think it would make sense to ok to report success based on the average.

I also asked Wessely about the above quote, and he wrote: “There was a significant improvement in the walking test after graded exercise therapy, which was not matched by any other treatment arm, and this was reported in the primary paper (White et al, 2011) and certainly not regarded as irrelevant.” So I guess the next step is to find the subsequent comments in the other journal where the investigators dismissed the walking-test result as irrelevant.

And I disagree, of course, the decision of the investigators not to share the step-test data. Whassup with that? This is one reason I prefer to have data posted online rather than sent by request, then anyone can get the data and there doesn’t have to be anything personal involved.

Davis et al. conclude:

We therefore urge The Lancet to seek an independent re-analysis of the individual-level PACE trial data, with appropriate sensitivity analyses, from highly respected reviewers with extensive expertise in statistics and study design. The reviewers should be from outside the U.K. and outside the domains of psychiatry and psychological medicine. They should also be completely independent of, and have no conflicts of interests involving, the PACE investigators and the funders of the trial.

This seems reasonable to me, and not in contradiction with the points that Wessely made. Indeed, when I asked Wessely what he thought of this, he replied that an independent review group in a different country had already re-analyzed some of the data and would be publishing something soon. So maybe we’re closer to convergence on this particular study than it seemed.

From the results of the study to the summary and the general recommendations

One thing I liked about Wessely’s post was his moderation in summarizing the study’s results and its implications. He reports that in the study his preferred treatment outperformed the alternative, but he recognizes that, for many (most?) people, none of these treatments do much. Wessely points out that this is not just his view; he quotes this from the original article by White et al.: “Our finding that studied treatments were only moderately effective also suggests research into more effective treatments is needed. The effectiveness of behavioural treatments does not imply that the condition is psychological in nature.”

Some questions that come to me are: Can we say that different treatments work for different people? Would we have some way of telling which treatment to try on which people? Are there some treatments that should be ruled out entirely? One of the concerns of the PACE critics is that the study is being used to deny social welfare payments to people with chronic physical illness.

And one of the criticisms of PACE coming from Davis et al. has to do with reporting of results:

In an accompanying Lancet commentary, colleagues of the PACE team defined participants who met these expansive “normal ranges” as having achieved a “strict criterion for recovery.” The PACE authors reviewed this commentary before publication.

This commentary seems to be a mistake, in that later correspondence, the PACE authors wrote, “It is important to clarify that our paper did not report on recovery; we will address this in a future publication.” That was a few years ago; the future has happened; and I guess recovery was not so easy to assess. This happens a lot in research: early success, big plans, but then slow progress. Certainly not unique to this project.

From my perspective, when I wrote about the PACE study hurting the reputation of the Lancet, I was thinking not so much of the particular flaws of the original report, or even of that incomprehensible mediation analysis that was later published (after all, you can do an incomprehensible mediation analysis of anything; just because someone does a bad analysis, it doesn’t mean there’s no pony there), but rather the Lancet editor’s aggressive defense and the difficulty that outsiders seemed to have in getting the data. According to Wessely, though, the study organizers will be sharing the data, they just need to deal with confidentiality issues. So maybe part of it is the journal editor’s communication problems, a bit of unnecessary promotion and aggression on the part of Richard Horton.

To get back to the treatments: Again, it’s no surprise that CBT and exercise therapy can help people. The success of these therapies for some percentage of people, does not at all contradict the idea that many others need a lot more, nor does it provide much support for the idea that “fear avoidance beliefs” are holding back people with chronic fatigue syndrome. So on the substance—setting aside the PACE trial itself—it seems to me that Wessely and the critics of that study are not so far apart.

78 thoughts on “Pro-PACE, anti-PACE”

Anna on January 13, 2016 10:22 AM at 10:22 am said:

sAD TO READ THAT YPU ARE UNSURE AND NOW HAVE COME TO TRHYINK THAT EXCERCISE IS A CURE FOR SOME OF US THE EXCERCISE NELIFE THAT EXCERCISE CAN CAURE ANY DISASE OIR HGEALTH PROBLEM ANF IS THE MAGIC BULLET IS VERY SAD AS IF I HAVENT TRIED DONE ALL THE EXCERCISE TO CDUIRE MUYSELF OVER THIRTY YEARS .tHE RESOIN WHY PARTIENMTS WITH me FEEL SO ANGRY IS THEIR OWN POERSONAL EXPERIABNCXES ARE TOTALUYY DISREGARDED JUST LIKE GAY MEBN WITH aids WRTRE EXPLAIONEC AWAY AS A GAY DISASE WITH THOSE WIOTH hiv CAUSEING THE ILLNESS TYO GET ATENTION.
EVBERY TIME WE TRY TO WIN SOME GRIUND FOR OURSELFS IT EBBS AWAY IN EXCUSES FOR A PSYCHIATRIST WHO CAllked gulf war syndrome a rummor and mass poiosoning in englandf the camelford water poisonoinjg recenthystera he called that one a legend .N

Reply ↓
- Dikran Marsupial on January 14, 2016 6:42 AM at 6:42 am said:
  
  FWIW (single datapoint) GET can be useful in management of ME/CFS in some cases, for example mine. However getting an appropriate regime established is not straightforward and will vary from subject to subject (and I suspect there will be some subjects for whom it won’t be effective at all or will make things worse). I suspect that ME/CFS is not a single illness so there won’t be a single cure/management strategy, and it is just as misguided to apply the same strategy for every patient as it is to say that some particular strategy won’t help anybody. It is a pity (rather like climate change) that it has become such a polarizing issue.
  
  Reply ↓
- Lidia on April 17, 2016 9:48 PM at 9:48 pm said:
  
  The PACE Trial Gets Its Most Devastating Critique Yet: a full blown critique by Dr Rebecca Goldin, Professor of Mathematical Sciences and the head of a statistical organization.
  
  She writes, “The study is under increasing scrutiny by scientists and science writers about whether its conclusions are valid. The question of how all this happened and how the criticism is being handled have sent shockwaves through medicine”.
  
  She added, “The results from PACE… have been published in prestigious journals and influenced public health recommendations around the world; and yet, unraveling this design and the characterization of the outcomes of the trial has left many people, including me, unsure this study has any scientific merit. How did the study go unchallenged for five years?”
  
  This is a MUST read. It is written by someone who is themselves involved designing trials:
  https://www.stats.org/pace-research-sparked-patient-rebellion-challenged-medicine/
  
  Reply ↓
Tom K on January 13, 2016 10:49 AM at 10:49 am said:

CBT:
No improvement in employment measures;
No improvement in step test;
No improvement in 6-minute walking test.
Doesn’t sound very successful to me.

Some improvements in subjective outcome measures but they could be to all sorts of response biases due to seeing a therapist for a lot of appointments, time invested in the therapy, etc.
Also, “outcomes with SMC alone or APT improved from the 1 year outcome and were similar to CBT and GET at long-term follow-up”.

Reply ↓
- Tom K on January 13, 2016 12:23 PM at 12:23 pm said:
  
  Typo:
  Some improvements in subjective outcome measures but they could be *due* to all sorts of response biases.
  
  Reply ↓
- David E on January 17, 2016 8:07 PM at 8:07 pm said:
  
  Here are the facts, Mr. Wessely and Mr. Gelman, read and learn:
  
  • A scientific analysis of the PACE trial – https://www.me-ireland.com/bogus.htm#pace
  • The documented harms caused by exercise and exertion in cases of ME and CFS – https://www.me-ireland.com/scientific/16.htm
  • Scientific and medical evidence – https://www.me-ireland.com/scientific.htm
  • The abuse and harassment of ME / CFS patients – https://www.me-ireland.com/bogus.htm
  
  Reply ↓
A.B. on January 13, 2016 10:54 AM at 10:54 am said:

Andrew, I think you are still missing the point. The PACE trial was unblinded and based on subjective outcomes. Under these circumstances even homeopathy and sugar pills can appear effective. The authors have consistently acted in ways that increase bias rather than reducing it.

Reply ↓
Anoneuoid on January 13, 2016 11:32 AM at 11:32 am said:

We have to look at what they actually collected in terms of data, then assess whether the difference they observed is probably primarily accounted for by the treatment or maybe something else. I looked at White et al (2011) [1], who report a mean difference of ~3.3 on fatigue scores. Here is the fatigue scale: https://evaluatingpace.phoenixrising.me/aps3chalder.html

We can think of many possibilities, but I’ll give a few: Can a difference of that size be explained by the day of week the survey was filled out? What about time of day? Did all patients come in on the same day of week, were all of these patients run in parallel? What about rescheduling until a “better time”, did any of that occur?

[1] https://www.ncbi.nlm.nih.gov/pubmed/21334061

Just because there is a difference does not mean it was due to the treatment…

Then there is the issue of whether this measures fatigue, or also/instead something else. For example, the last question is (quite oddly) “How is your memory?”, where a “worse” score corresponds to better memory. The others are all asking for a comparison “to usual”, which also relates to memory. It would seem that a treatment that caused worse memory would result in a “better” score on this test, even if it did not affect fatigue at all.

Reply ↓
- Anoneuoid on January 13, 2016 12:14 PM at 12:14 pm said:
  
  I see now that last question about memory probably had the scale reversed.[1] That would be less bizarre, but does not address the memory component of this test. Worse memory may still move “usual” closer the current ill state.
  
  Also, from this paper: “CFS patients had a mean fatigue score of 24.4 (S.D. 5.8), while the community sample had a mean fatigue score of 14.2 (S.D. 4.6).” In the PACE paper linked above the mean at baseline was ~28 which decreased to ~24 at the first timepoint for all groups, so it looks like some regression to the mean is going on here as well.
  
  Another thing, they do talk about improvement (ie, recovery): “A secondary post-hoc analysis compared the proportions of participants who had improved between baseline and 52 weeks by 2 or more points of the Chalder fatigue questionnaire”
  
  [1] https://www.ncbi.nlm.nih.gov/pubmed/20630259
  
  Reply ↓
jimells on January 13, 2016 11:39 AM at 11:39 am said:

Dr Gelman, you missed where the study authors admit a null result at long term followup:

https://www.ncbi.nlm.nih.gov/pubmed/26521770

Rehabilitative treatments for chronic fatigue syndrome: long-term follow-up from the PACE trial.

“There was little evidence of differences in outcomes between the randomised treatment groups at long-term follow-up.”

You have also missed the point that a majority of patients report harms from GET:

https://iacfsme.org/PDFS/Reporting-of-Harms-Associated-with-GET-and-CBT-in.aspx

Reporting of Harms Associated with Graded Exercise Therapy and Cognitive Behavioural Therapy in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome

“Fifty-one percent of survey respondents (range 28-82%, n=4338, 8 surveys) reported that GET worsened their health while 20% of respondents (range 7-38%, n=1808, 5 surveys) reported similar results for CBT”

To sum up patient objections to this study:
1. The treatments are ineffective
2. The treatments cause harm

Reply ↓
Placeboid on January 13, 2016 11:48 AM at 11:48 am said:

It is true that the PACE trial has some strengths of a good quality RCT, but Wessely glossed over some major weaknesses in his Mental Elf article, which were quoted here in the current blog.

1) If blinding did not really matter in RCTs, drug trials would not bother with it as much as they do.

Wessely suggests: “One therapy was rated beforehand by patients as being less likely to be helpful, but that treatment was CBT. In the event, CBT came out as one of the two treatments that did perform better. If it had been the other way round; that CBT had been favoured over the other three, then that would have been a problem. But as it is, CBT actually had a higher mountain to climb, not a smaller one, compared to the others.”

What he did not mention is, while CBT may not be synonymous with simplistic “positive thinking”, CBT still encourages optimism on the issue of improvement and recovery from CFS, and aimed to change patients’ perceptions about their symptoms and disability. The therapy manuals (and presumably the therapists) also told patients how effective and safe CBT is. This may be like a drug trial biasing patients in favour of the drug on every dose, and then saying that such bias does not matter because the drug was rated poorly before they first took their first dose.

Wessely claims the two most recent systematic reviews rated PACE as having a low risk of bias. However, both those systematic reviews (the Cochrane review and at least the full version of the P2P review) rated the PACE trial as having a high risk of bias in terms of (non)blinding.

Without properly accounting for all biases e.g. the potential biases of subjective outcomes in non-blinded trials, an otherwise well-designed RCT may simply be measuring those biases. Subjective outcomes are important, but less reliable in trials where the effects are small to modest and contradicted by objective outcomes.

2) The PACE trial did have a high retention rate, but excluded 80% of candidates, so it is possible that it was a highly pruned cohort. Those less likely to stick around or less able to tolerate exercise may have simply been less likely to join the trial in the first place. Those who had a preference for a particular therapy offered in the trial were explicitly excluded (which is fair enough, but can interfere with assumptions about the generalisability of the results).

3) Wessely gives the impression that changes to the protocol were minor and unimportant. But as covered by others elsewhere, some changes were major and likely inflated estimates of clinical response by several fold, which is not minor. All thresholds for clinical improvement in fatigue and physical function, on an individual level, were post-hoc additions. There is also evidence that the recovery criteria was changed after the authors were already familiar with each component of the recovery criteria. Overlap between full recovery, and trial entry criteria for severe disabling fatigue, is unacceptable. There is still unpublished safety data too.

4) The improvement of 6MWD scores in the GET group was small, did not reach their own definition of clinically useful difference (0.5SD) and has since been attributed (in the Lancet Psychiatry editorial on the mediation paper) to pushing harder on the test rather than actually being fitter. Walking was the most commonly chosen activity in the GET group, so there may have been a training effect. Overall, a range of objective outcomes from the trial suggests no meaningful improvements, which contradicts the assumptions and goals of CBT and GET i.e. to increase function and activity. Similarly, in other trials of CBT/GET, actometers demonstrated no significant difference between groups at follow-up. It is clear that these therapies do not improve activity or function in the same way as commonly promoted.

5) An independent re-analysis of the individual-level PACE trial data is almost useless if it simply does the same analyses that the PACE group themselves conducted. Patients want the protocol-specified outcomes.

Reply ↓
A.B. on January 13, 2016 12:13 PM at 12:13 pm said:

> Indeed, when I asked Wessely what he thought of this, he replied that an independent review group in a different country had already re-analyzed some of the data and would be publishing something soon.

This presumably refers to a Cochrane review, which is not independent as claimed since three principal investigators of the PACE trial and the principal investigator of the FINE trial, were involved in writing the protocol for this review. See author list here:

https://onlinelibrary.wiley.com/doi/10.1002/14651858.CD011040/full

Reply ↓
- clark ellis on January 13, 2016 12:49 PM at 12:49 pm said:
  
  Yep, as it is Cochrane independence is assumed. But look at who is actually involved and very quickly you see that it cannot possibly be independent. This is ultimately the problem with PACE that things on the surface look acceptable so long as no one looks at the details and people like Wesley make all kinds of remarks like this and bank on you not looking at the details. Obviously Davis et al mean real independence else it is just the same people validating their own work.
  
  Reply ↓
  - A.B. on January 13, 2016 1:00 PM at 1:00 pm said:
    
    Well this is just a guess. I would be very surprised if this independent review met the requirements proposed by Davis and colleagues.
    
    We therefore urge The Lancet to seek an independent re-analysis of the individual-level PACE trial data, with appropriate sensitivity analyses, from highly respected reviewers with extensive expertise in statistics and study design. The reviewers should be from outside the U.K. and outside the domains of psychiatry and psychological medicine. They should also be completely independent of, and have no conflicts of interests involving, the PACE investigators and the funders of the trial.
    
    Reply ↓
Sasha on January 13, 2016 12:43 PM at 12:43 pm said:

Thank you, Andrew, for your continued interest in PACE. I’m afraid that in his support of PACE, Prof. Wessely repeatedly (and unaccountably) fails to address the main critical point levelled at PACE, even though people keep pointing it out to him. Just as he failed to mention this spectacular flaw in all 4,600 words of his Mental Elf article, he’s done it again with you.

PACE was a non-blind trial with subjective outcome measures, in which CBT and GET patients – but not the others – were told for six months that they could recover by their own efforts. They were then asked, in face-to-face assessments, to rate their improvement.

The social pressure for patients to say that they’d improved, even if they hadn’t, would have been tremendous.

I do not understand why Professor Wessely, clearly not a stupid man, appears not to see the massive source of bias here that would explain why CBT and GET patients rated themselves as more improved than the other groups.

Relying on self-rated measures in an open-label trial would be widely recognised by clinical trial designers as a fatal flaw in a randomised trial and it’s why drug trials have placebo controls. This is a very, very basic aspect of clinical trial design.

The objective measures in the trial – all used as secondary measures, for some bizarre reason – failed to show any difference, including a step-fitness test and economic measures such as days lost from work and the payment of sickness benefits (GET patients walked a short distance further than other groups in a walking test but the difference was very small, despite being statisically significant, and was plausibly attributed in an accompanying editorial to other factors).

PACE was a null trial, but few people seem to be able to face that truth. I still wonder if people simply can’t believe that a £5m trial can be a shoddy pile of nonsense. The published long-term follow-up data confirm that, once away from the researchers, patient who received CBT or GET on top of standard medical care rated themselves as no better than those who received standard medical care alone.

Reply ↓
- jimells on January 13, 2016 2:41 PM at 2:41 pm said:
  
  >I still wonder if people simply can’t believe that a £5m trial can be a shoddy pile of nonsense.
  
  If it were “only” a shoddy pile, we would just laugh at it and move on, like most of the useless CDC studies on “CFS”. Unfortunately this shoddy pile is used as a club to deny access to social support, benefits, and even medical care to alleviate symptoms as much as possible.
  
  The PACE trial and its authors have caused tremendous harm to patients and especially the families of patients who took their own lives. Advocates will continue to criticize PACE until it is retracted and the authors held to account for their actions.
  
  Reply ↓
- Martha on January 14, 2016 6:59 PM at 6:59 pm said:
  
  Sasha said: “PACE was a non-blind trial with subjective outcome measures, in which CBT and GET patients – but not the others – were told for six months that they could recover by their own efforts. They were then asked, in face-to-face assessments, to rate their improvement.
  
  The social pressure for patients to say that they’d improved, even if they hadn’t, would have been tremendous. ”
  
  I have heard this phenomenon referred to as “demand characteristics” https://en.wikipedia.org/wiki/Demand_characteristics
  
  Reply ↓
ExP on January 13, 2016 12:58 PM at 12:58 pm said:

Andrew,
Thank you for continuing to delve into the mirk. Having previously appeared to give succour to those determined that PACE must die, you should prepare yourself for a rapid fall in popularity amongst that constituency. As you identify the issue of greatest concern is the lack of open data – and the protestations that it’s all about blinding, ring very hollow, given that Professor White has recently called for all UK Universities to be exempt from Freedom of Information legisaltion, the very legislation upon which UK citizens rely to get access to data which those Universities, as with QMUL feel free obfuscate.

Open data is a matter of international concern, in the case of PACE there is also a more parochial issue and that is the way that a particular school of psychological research, with backroom support from NHS management, is using the vastly overstated value of PACE to continue garnering research funding in order to flog the dead horse of CBT/GET into a shape sellable to patients. In this scenario we aren’t looking at simply finessing research which has limited scope into something with wider application, but a use of scarce funding to back researcher prediliction. ME/CFS patients almost universally have well established differences of perspective from Professors, Wessely, Sharpe, White and Chalder and want their health service to value their voices at least as strongly as any academically derived weak evidence base. There isn’t a methodological solution for that problem, ultimately a social and political solution is required.

Reply ↓
- A.B. on January 13, 2016 1:31 PM at 1:31 pm said:
  
  > As you identify the issue of greatest concern is the lack of open data – and the protestations that it’s all about blinding, ring very hollow, given that Professor White has recently called for all UK Universities to be exempt from Freedom of Information legisaltion, the very legislation upon which UK citizens rely to get access to data which those Universities, as with QMUL feel free obfuscate.
  
  I think you’re a little confused here. Are you really trying to say that we should not take concerns about lack of blinding seriously because one of the authors is trying to make it harder for the public to get access to research data? There is no logical argument here.
  
  I also have a question to you: do you think the problem is CBT and GET could work but don’t appeal to patients, or that CBT and GET don’t actually work?
  
  Seems to me that the first case applies, in which case I’m interested on what research you’re basing your opinion on (PACE is the largest trial, and it failed to find a benefit).
  
  Reply ↓
Keith Geraghty on January 13, 2016 1:04 PM at 1:04 pm said:

Having close colleagues, friends and co-workers, heavily aligned to your own perspective of this illness, would hardly constitute an appropriate set of experts to evaluate the robustness and validity/reliability of the PACE trial.

The lack of independent scrutiny of the PACE trial – and the manner in which it and other trials/reviews of CBT and GET are funded – requires very careful reappraisal in my opinion in order to safe guard patients (given such trial data is used as EBM in NiCE guidelines and shapes policy and practice in the NHS).

The PACE trial will stand as an example of how not to do it. Prof. Wessely is unlikely to critise a trial he has supported from inception, has been undertaken by his close colleagues, some within his department, most of whom
promote a treatment approach he helped shape / again we have a lack of independence in defence of PACE.

That may be the essential theme – lack of independent thinking and scrutiny.

Reply ↓
Sasha on January 13, 2016 1:06 PM at 1:06 pm said:

Andrew, you wrote that the Lancet commentary appeared to be mistaken in describing the ludicrous “normal ranges” for fatigue and physical function as a “strict criterion for recovery.” You pointed out that in later correspondence, the PACE authors wrote, “It is important to clarify that our paper did not report on recovery; we will address this in a future publication.” And then you said, “That was a few years ago; the future has happened; and I guess recovery was not so easy to assess. This happens a lot in research: early success, big plans, but then slow progress.”

Are you aware that in 2013, the PACE authors published a paper called “Recovery from chronic fatigue syndrome after treatments given in the PACE trial”, using exactly these normal ranges as two of their four recovery criteria? They had changed all four criteria from what had been specified in the study protocol such that it would have been considerably easier to “recover”.

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3776285/

Data from PACE’s sister-trial, the FINE trial, show that if you change the SF-36 physical function recovery threshold in the way it was change in PACE, you get a six-fold increase in the number “recovered”. Worrying stuff.

A good example of why post-hoc changes to clinical trial analysis protocols are seen as a very, very bad thing.

Reply ↓
L.T.R. on January 13, 2016 1:39 PM at 1:39 pm said:

Getting some pressure from colleagues, Andrew? We know the drill. But in reality, we are ALL smarter than you think we are and the smarts we have refuting everything the authors of the PACE Trial say are too well documented now.

Sitting on the fence will bury your career in the long run. So stop being scared of these UK Psychiatric “pills” because you will go down with them once the data is revealed. If Tom Kindlon and other patients as well as ME/CFS organizations and journalists and other psychologists can see right through outcome switching and you can’t (I think you can) and then won’t speak out with clarity and are starting to wobble due to pressure, you will go down in history as a wannabe whistle blower who couldn’t quite cut it.

Reply ↓
- Andrew on January 13, 2016 1:46 PM at 1:46 pm said:
  
  Ltr:
  
  No, no pressure; I don’t actually know any of the people involved (except for Bruce Levin, but I haven’t run into him in awhile and I’ve never talked with him about this study). I just not an expert in the field, certainly not any kind of “whistleblower” here given that I have zero information and am just working with the published materials and what ever people email me. There are experts in this field and I’m not one of them. What happened was that I got an email from Wessely and I thought it appropriate to place his views in the context of my earlier posts. I have no problem with commenters who follow up and add information. I’m not trying to present any kind of last word here, I’m just trying to air the discussion from a statistical perspective.
  
  Reply ↓
  - L.T.R. on January 13, 2016 1:57 PM at 1:57 pm said:
    
    Bad news, you are his PAWN. He isn’t stupid as he knew you would present his information just how he wanted and you would write without clarity on the subject.
    
    If only he would write to Tuller or Coyne.
    
    Reply ↓
    - Anonymous on January 13, 2016 2:27 PM at 2:27 pm said:
      
      I get it that people are upset about PACE, and that it’s even maybe justified to be a bit paranoid considering the conspiracy like behavior of the PACE guys… but one of the issues here with PACE is that because it was so crappy and so harmful in its policy consequences, and soforth, the opponents of PACE have begun to sound like nutjobs.
      
      Frustration + nutjobbery is not going to help the anti-PACE cause. Accusing someone like Andrew in his own blog of being pressured and used like a Pawn by powerful forces… Not a good strategy in my opinion.
    - L.T.R. on January 13, 2016 2:46 PM at 2:46 pm said:
      
      We are out of strategy time and in reality have been. Every suicide is on them. Every death due to no money being put into bio research IS ON THEM!
    - A.B. on January 13, 2016 2:48 PM at 2:48 pm said:
      
      A troll posing as patient is also a possibility.
    - L.T.R. on January 13, 2016 2:55 PM at 2:55 pm said:
      
      As usual, we ME/CFS patients around the world have to prove ourselves as real bio patients, as patients suffering in front of our own doctors, to our friends and family and as real patients on a website.
      
      You are a troll of all the very sick patients coming here.
    - Anonymous on January 13, 2016 3:35 PM at 3:35 pm said:
      
      A.B. is on your side, he’s saying that you seeming to be a troll might be due to you really being some PRO PACE person trying to discredit patients by pretending to show how patients are all trolls anyway. Instead, as a patient, you are unable to see that he is defending patients, and you attack him.
      
      Sure, patients are rightfully angry, but apparently this has turned them into people who can’t advocate for themselves and who appear externally to be nutjobs, and they attack even their own advocates (of which A.B is clearly one).
      
      so, I reiterate, this is not good strategy. The alternative to good statistics is not no statistics, it’s bad statistics, the alternative to good strategy is not no strategy it’s bad strategy.
    - jimells on January 13, 2016 4:17 PM at 4:17 pm said:
      
      There is every reason for patients to be extremely pissed off about our mistreatment. And just as there is a time and place for anger, there is also a time and place for diplomacy. This blog is one of them.
Jonathan on January 13, 2016 2:33 PM at 2:33 pm said:

Dear Andrew,
I am perplexed. Like you I found myself having to assess the value of this trial as an outsider. Maybe the problem is that this is simply not in issue of statistics. In a sense it is not. But surely statisticians are interested in the use of blinding of treatments to avoid bias. At least bias is a statistical issue? When I raised the concern that this is an unblinded trial with subjective endpoints the reply I had from those close to PACE was that lots of trials are unblinded and lots of trials have subjective endpoints and they are considered robust.

This reply indicates a complete failure to understand that blinding is designed to remove the bias due to subjective endpoints and that therefore the problem is only ever with the COMBINATION of lack of blinding and subjective end points. As I understand it nobody in clinical pharmacology would look twice, or even read through, a trial that had both issues. Surely this is basic stuff that you would teach to students? If I presented a trial, with this design, of a drug, I would be laughed off the podium. The FDA or MCA would rightly throw it in the bin. We used to have trials of physiotherapy like this but even the physios learnt that no credible journal would publish them.

I cannot understand why this is not crystal clear to all? And intelligent patients have pointed out to me that this is just the start of it because CBT is deliberately designed to make people biased in their reports – it tries to make them believe they are better, that is its sole objective. So how can the PACE trial even get to first base?

Reply ↓
- Andrew on January 13, 2016 2:46 PM at 2:46 pm said:
  
  Jonathan:
  
  I can’t be sure, but I’m guessing that the reason the PACE trial could get to first base is that it’s hard to blind a talk-therapy treatment. Also it was my impression that the study included both objective and subjective endpoints, in which case I can see the criticism of an opportunistic analysis but the study could still be useful.
  
  Regarding my own take on all this: I’m not sure, again, this whole thing is far from my substantive areas of expertise. After my first couple of posts on PACE, I was struck by the complete lack of defenders of the study. In addition the behavior of journal editor Richard Horton didn’t seem appropriate, and I recalled that whole thing from ten years ago with Lancet’s Iraq mortality survey. Seeing as I’d heard nothing good about PACE from anybody, once I did hear from Wessely, it seemed like the best course of action to post his reactions side by side with the criticisms of the study.
  
  Reply ↓
  - Valentijn on January 13, 2016 3:12 PM at 3:12 pm said:
    
    The objective measurements were not used in determining recovery. And the only one showing any improvement was for GET in the 6MWT, which the authors admit was not large enough to be “clinically significant”. Employment, reliance on welfare benefits, etc, showed no improvement, but the authors argued they were irrelevant or unreliable for various reasons.
    
    Actometer use was dropped, possibly due to CBT/GET trials in the Netherlands having shown that there was no increase in activity despite improved fatigue scores. Step test results have been shown in the form of an excessively stretched graph, but requests for the numerical values for the handful of points on that graph have been rejected as “vexatious”. Analysis of the graph with a software program suggests that CBT and GET patients scored quite a bit (significantly?) worse than APT and SMC patients.
    
    The excuse of the authors is that “CFS symptoms are subjective, therefor the primary outcomes should be too.” But this is extremely dishonest. Post-exertional malaise is documented via a 2-day CardioPulmonary Exercise Test. Orthostatic Intolerance can be seen on a Tilt Table Test. Cognitive impairment is verified on specific neuropsychological tests. It is also quite odd to model a treatment based on the supposed inability of patients to accurately interpret their own symptoms, then claim that patient questionnaire scores are the only way to determine if patients are cured.
    
    Reply ↓
    - Anoneuoid on January 13, 2016 3:21 PM at 3:21 pm said:
      
      >”Actometer use was dropped, possibly due to CBT/GET trials in the Netherlands having shown that there was no increase in activity despite improved fatigue scores.”
      
      Interesting. I bet a major (maybe not primary) component of these fatigue scores is actually memory, and improved memory looks like more fatigue (see my post above). Has this been assessed anywhere?
    - Valentijn on January 14, 2016 3:38 AM at 3:38 am said:
      
      The scale used is the Chalder Fatigue Scale, devised some years ago by one of the principle investigators. It uses 11 questions to ask about mental and physical fatigue. The questions are listed at the top of the page at https://evaluatingpace.phoenixrising.me/aps3chalder.html with more details about its use in the PACE trial further down the page.
      
      It’s not a scale that patients seem to have any use for, but psychosocial researchers seem to like using it to show that less fatigue is reported after months of telling patients that they will be cured if they ignore their fatigue.
      
      I don’t think there have been any trials showing improvement on cognitive tests after GET or the illness-denial CBT used by these investigators.
    - Anoneuoid on January 14, 2016 3:59 PM at 3:59 pm said:
      
      I looked into it a bit and commented up here:
      https://statmodeling.stat.columbia.edu/2016/01/13/pro-pace/#comment-259135
      
      From looking around a bit I didn’t see any references regarding memory. Am I to understand that no one knows if this “fatigue score” may measure memory instead of (or in addition to) fatigue?
  - jimells on January 13, 2016 3:16 PM at 3:16 pm said:
    
    >After my first couple of posts on PACE, I was struck by the complete lack of defenders of the study.
    
    I’d like to point out that it is only the “Wesseley School” and fellow travelers that receive such withering criticism. Research papers by psychologists like Leonard Jason, for example, are closely examined, but there are no calls for retraction, etc.
    
    Reply ↓
  - urbantravels on January 13, 2016 3:21 PM at 3:21 pm said:
    
    “Also it was my impression that the study included both objective and subjective endpoints, in which case I can see the criticism of an opportunistic analysis but the study could still be useful.”
    
    This is why the outcome switching has such a suspicious appearance. The objective measures were dropped or watered down after the trial was underway, and the subjective measures were given undue prominence. Subsequently, the refusal to release the trial data, which would allow analysis of outcomes per the original trail protocol, raises further questions. Plausible-sounding excuses for each link in this chain of convenient coincidences are always on offer, of course.
    
    “To lose one objective outcome measure, Mr. Worthing, may be regarded as a misfortune: to lose so many looks like carelessness. ”
    – Not Oscar Wilde
    
    Reply ↓
  - A.B. on January 13, 2016 3:24 PM at 3:24 pm said:
    
    Quoting Tuller
    
    *The investigators abandoned all the criteria outlined in their protocol for assessing their two primary measures of fatigue and physical function, and adopted new ones (in the 2011 Lancet paper). They also significantly relaxed all four of their criteria for defining “recovery” (in the 2013 Psychological Medicine paper). They did not report having taken the necessary steps to assess the impacts of these changes, such as conducting sensitivity analyses. Such protocol changes contradicted the ethos of BMC Neurology, the journal that published the PACE protocol in 2007. An “editor’s comment” linked to the protocol urged readers to review the published results and to contact the authors “to ensure that no deviations from the protocol occurred during the study.” The PACE team has rejected freedom-of-information requests for the results as promised in the protocol as “vexatious.”
    
    *The study’s two primary outcomes were subjective, but in the 2007 published protocol the investigators also included several “objective” secondary outcomes to assess physical capacity, fitness and function; these measures included a six-minute walking test, a self-paced step test, and data on employment, wages and financial benefits. These findings utterly failed to support the subjective reports that the authors had interpreted as demonstrating successful treatment and “recovery.” In subsequently published comments, the authors then disputed the relevance, reliability and “objectivity” of the main objective measures they themselves had selected.
    
    https://www.virology.ws/2015/10/21/trial-by-error-i/
    
    Primary outcomes are used to determine the effectiveness of an intervention. Secondary outcomes may yield additional information.
    
    So, PACE started with a reasonable protocol and morphed into an uninterpretable mess along the way.
    
    Since blinding is impossible in psychotherapy trials, abandoning objective primary outcomes is inexcusable, since they would have provided a dimension of objectivity against the inevitable bias in subjective outcomes.
    
    It should also be kept in mind that the “objective” outcomes were not truly objective. These are still to some degree under the control of the patient, just to a lesser degree (but much less so than asking patients to rate their health).
    
    Reply ↓
  - Jonathan on January 13, 2016 3:30 PM at 3:30 pm said:
    
    Surely, this is actually bang in the middle of your expertise as a statistician, Andrew. An surely, there is never a case for ‘We cannot do it in such a way as to get reliable evidence for this sort of treatment so it must be good enough to get unreliable evidence.’ Isn’t being a statistician about saying, er no? And yes, there were objective endpoints, some of which seemed to get dropped off the agenda and the others showed no benefit – and very convincingly. In fact the study is probably very useful, as Tom Kindlon has argued. It shows that the treatments had no clinically significant effect.
    
    Are you also aware of the follow up paper, which is in the middle of this debate. It claimed to show that the benefit continued. Yet anybody with an ounce of common sense would make the opposite conclusion since at the later time point it made no difference what treatment you had had – the other patients had got just as much better. Surely there is some maths to be found in ‘no difference after all’?
    
    I am really baffled by the fact that almost nobody commenting on this trial can see a flaw that would be used by my colleagues to wipe the floor with the presenter at Grand Rounds. Even my co-authors on the letter to the Lancet seem to concentrate on minutiae, to be honest. Maybe I belong to another generation. Maybe nobody cares any more.
    
    Reply ↓
    - Andrew on January 13, 2016 3:37 PM at 3:37 pm said:
      
      Jonathan:
      
      By “the follow up paper,” do you mean the one with the mediation analysis? I hated that!
    - adrian on January 13, 2016 3:58 PM at 3:58 pm said:
      
      I think by the follow up paper Jonathan means the long term follow up paper that was published recently.
      
      https://www.thelancet.com/journals/lanpsy/article/PIIS2215-0366%2815%2900317-X/abstract
      
      The basic result showed no difference between the treatments however, the paper was used to push the CBT and GET arms as their ‘improvement’ held up however, the other groups improved more.
- Martha on January 14, 2016 7:26 PM at 7:26 pm said:
  
  Yes, the lack of blinding does concern me. In drug trials, pains are taken to be sure that pills in two arms of a clinical trial look identical, and other measures to create true blinding. (e.g., see https://www.medicine.ox.ac.uk/bandolier/booth/glossary/blind.html).
  
  Andrew’s comment “I’m guessing that the reason the PACE trial could get to first base is that it’s hard to blind a talk-therapy treatment,” may be an understatement — “impossible” probably fits better than “hard”. Still, his comment, “but the study could still be useful,” needs to be viewed with skepticism; “useful” would need a good argument as to why the lack of blinding does not make the results meaningless.
  
  Reply ↓
John on January 13, 2016 2:51 PM at 2:51 pm said:

Keith Geraghty is right- Simon Wessely is listed on the PACE trial protocol under ‘centre leaders and co-leaders’ and “is unlikely to critise a trial he has supported from inception, has been undertaken by his close colleagues, some within his department, most of whom
promote a treatment approach he helped shape”. Why would you expect a critical analysis?

Here’s the thing- for Simon Wessely (not to mention the rest of the PACE investigators) to say that “the actual outcome measures did not change” because the same measuring instrument was used that was described in the trial protocol is kind of at the heart of what’s wrong with PACE.

Say you had a trial on growth hormone, where the entry criteria was 65 inches and was described as being ‘very short’, with a successful outcome being 85 inches, described as ‘very tall’, with both of these heights being measured with a yardstick or measuring tape, etc. Then during the course of the trial the outcomes were changed so that a height of 60 inches was substituted as part of the criteria for being ‘very tall’ and the mid-range height of 75 inches for a ‘positive outcome’ was disappeared completely, with the investigators and their supporters repeatedly denying that the outcomes were changed because they still used a yardstick to measure the participants’ heights at the end of the trial.

The problem is that a yardstick is an instrument, not a measure, and the outcome measures were most certainly drastically changed. In fact I can’t think of even one outcome from PACE that wasn’t fiddled with from those laid out in the trial protocol, with virtually every change making it either easier for the authors to claim success and/or harder for participants to report being made worse. When you can’t even get a straight answer out of someone involved with the trial on such a simple and blatantly obvious point, where does that leave you with the more complex nuances? When you need an advanced degree in wordsmithing and an open dictionary in front of you just to discuss such a simple concept why would you even bother going any deeper than that? Just hand over the goddamn data and STGFU, please.

Also, when the therapies under discussion are not just similar to, but actually consist of ‘essence of the placebo response’ (as described by one of the authors himself), getting the outcomes accurately sorted is of primary importance before delving into the deeper issues involved.

PS- In the original Lancet paper, the 6MWT results were marginally higher in the GET group than in the other groups, but GET was also the trial arm with the largest amount of missing data for this test. This means that it is entirely possible that participants who either didn’t improve or possibly were made worse simply didn’t take the test and therefore the reported improvements are spurious. Even if they weren’t, the 6MWT scores in all arms of the trial were still worse at the end of the trial than patients with class III heart failure and 80-89 year olds.

I’m not aware of any other objective result that improved in PACE and again, this is why PACE is such a f(*&ing joke- no critical analysis at all from trial supporters, only a bunch of wishy-washy bs that completely avoids legitimate shortcomings and is being vastly oversold to an uncritical audience via the help of an ostensibly ‘objective’ public relations team, aka the UK Science Media Centre, which Simon Wessely also happens to be a Trustee of. Here’s an idea for a new themesong for PACE, the UK media and academic journals- Deference to Authority in the UK!

Reply ↓
- Mike on January 13, 2016 5:05 PM at 5:05 pm said:
  
  The missing GET data might explain the abandonment of the intention-to-treat analysis that was promised in the protocol.
  
  Reply ↓
S.T. on January 13, 2016 3:11 PM at 3:11 pm said:

Andrew, you said: “I can’t be sure, but I’m guessing that the reason the PACE trial could get to first base is that it’s hard to blind a talk-therapy treatment. Also it was my impression that the study included both objective and subjective endpoints, in which case I can see the criticism of an opportunistic analysis but the study could still be useful.”

The problem in PACE was that subjective outcomes were the primary outcomes and – apart from the walking test that showed a very small but statistically significant increase that favoured the GET group – the objective outcomes (all predefined as secondary) weren’t published in the main 2011 Lancet paper but much later, giving a very misleading impression of the study outcome. In particular, the crucial but dead-flat null step-fitness test wasn’t reported until four years later and even then, it was buried in a graph so small that you can’t read the values. Someone wrote to ask for the values and it was turned into a Freedom of Information request and dismissed as “vexatious”.

The economic analyses (also defined by the authors as “objective”) were also null.

So yes, the study could be useful in demonstrating that, according to objective measures, CBT and GET didn’t work in the PACE trial but Prof. Wessely and the PACE trial authors focus on the subjective measures, dismiss the objective ones that they don’t like and thus spin the results as though they favour their preferred therapies.

I’m afraid that Prof. Wessely is very far from being a disinterested commentator. As he says in the Mental Elf article that you quote, “I helped recruit some patients to the study… I am not a neutral observer… I consider the most senior [PACE investigators] to be personal friends. Do I have competing interests? Sure I do.” More to the point, he is a strong proponent of this psychological approach to CFS.

Reply ↓
Zach on January 13, 2016 4:44 PM at 4:44 pm said:

Andrew, thanks for your continued interest in the PACE trial. It’s a complex and sometimes nuanced subject, and I think it raises lots of interesting questions in relation to methodology and transparency in medical trials in general. For example, issues such as: “outcome switching” (i.e. using post-hoc endpoints instead of pre-defined endpoints); not sharing publicly-funded data with other researchers and the public; the reliability of open-label methodology; non-blinding of data; the lack of an appropriate placebo control of comparison intervention; bias surrounding self-report measures; influencing patient expectations by informing them that an intervention is highly effective, etc.

As others have said, such biases and weaknesses in trial design and methodology would be expected to demonstrate efficacy for homeopathy. It is interesting to note that CBT demonstrated no efficacy for any of the objectively measured outcomes used in the trial, and demonstrated modest efficacy only for the self-report measures [1,2,3]. A lack of improvement in objectively measured outcomes suggests that the illness itself has not been modified by the interventions. This is especially the case in this trial which attempted to increase physical fitness and activity but failed on objective measures.

The methodology used for the PACE trial would not be accepted as robust evidence for the approval of pharmaceuticals; pharmaceutical trials would normally require blinding, and a placebo control arm and/or comparison with an established intervention. (PACE used no placebo control, and it lacked a robust comparison with an established and previously tested intervention.)

Apart from the outcome switching, the lack of an appropriate comparison or control arm, and lack of transparency, perhaps there’s no single stand-out issue that would make a casual observer sit up and take notice with regards to methodological weaknesses, but when all of the issues are combined, it becomes interesting.

From a statistical point of view, there may not be much of interest in it. Most of the criticisms aren’t related to statistical issues, but are related to other methodological issues. Perhaps the most interesting statistical issue is the recovery criteria [4]. Some basic errors were involved here, such as using a non-representative demographic sample to determine recovery thresholds, and inappropriately using a mean and standard deviation for data that doesn’t have a normal distribution, to calculate the normal range. This is what has given us the ridiculous situation whereby a patient could deteriorate in both of the primary outcomes measures (fatigue and physical function) and be classed as ‘recovered’. This is not only related to the original Lancet commentary [5], but it also relates to a 2013 recovery paper [4]. Obviously, if “recovery” can indicate deterioration, then this isn’t helpful. Unfortunately health care professionals, family, and friends, see the discussion and headlines and can take them at face value. Even health professionals, and clinical decision makers, are rarely expected to dig into data to assess whether recovery criteria are appropriate.

The factually incorrect Lancet commentary resulted in erroneous media reports such as: “About 30 per cent of patients given cognitive behavioural therapy (CBT) or graded exercise made a full recovery to normal levels of activity, the study found…” [6] Some media headlines were outrageously exaggerated; The Daily Mail reported that ME patients should “push themselves to their limits” for the “best hope of recovery” [7].

Such headlines and misinformation have an accumulative effect on the patient community, and are not simply forgotten about and cannot simply be labelled as a historic issues, especially if the misinterpretations of data have not been corrected. If health-care providers believe that ME/CFS is not a real illness and can simply be cured with some exercise, then it harms patient care. ME/CFS is a notoriously neglected illness, according to the patient community.

In the discussions on your blog, we have mainly been discussing the results at 52 weeks post-randomisation [1], but long-term outcomes at a median of 2.5 years have also been published, which showed no difference between trial arms [8]. i.e. when CBT and GET were added to standard medical care, there was no clinical benefit in the long-term. (CBT and GET were no different to receiving no-treatment.)

Unfortunately, the PACE investigators have not been clear about these outcomes either in their discussions in the published paper or in their communications with the media. They have repeatedly stated that there was a ‘sustained’ clinical benefit from CBT and GET, which the ME community has had to spend a lot of time correcting [13]. For example, both the follow-up paper itself and the accompanying press release state: “Researchers have found that two treatments for Chronic Fatigue Syndrome have long term benefits for people affected by the condition.”

However, during the period between 1 year and 2.5 years after randomisation, SMC & APT actually performed better in the self-report primary outcomes than CBT & GET. So it might have been the case that CBT and GET inhibited improvement in health after 52 weeks. This is the opposite of a ‘sustained benefit’.

I’m not aware of the investigators having publicly clarified to the media that there was no treatment effect at long-term. In addition to the confusion surrounding the recovery outcomes, this lack of clarity adds further confusion to the mix.

A further media onslaught of misinformation, related to the follow-up study, has further added to the confusion surrounding the illness. In a glowing report of the PACE trial, The Daily Telegraph repeated the claims of the press release and went as far as to claim that ME/CFS is not a chronic illness [9] (This claim has now been retracted after complaints by the community) [13]. Even the NHS Choices website (which is usually a reliable source) has repeated the spin that there was a long-term benefit from CBT and GET [10].

One other thing. In your blog, you refer to Simon Wessely’s comment on the National Elf Service Blog [11]. In the comment, Simon Wessely says that he had limited involvement with the PACE trial: “I was not on the ship, neither as passenger or crew. I helped recruit some patients to the study from our clinic, as did many doctors, but that was as far as it went. I am not an author on the ship’s log, but I am not a neutral observer.” [11] However, this doesn’t seem to be supported by other available information. e.g. the trial protocol says: “The authors thank Professors Tom Meade, Anthony Pinching and Simon Wessely for advice about design and execution.” [12] And the acknowledgements of the 2011 paper says: “Simon Wessely commented on an early draft of the report.” [1] I just wanted to mention this to clarify that Simon Wessely is not an independent observer.

In conclusion, there are many different methodological issues associated with the PACE trial, but there are also issues related to misinformation surrounding the promotion of the therapies, and how this affects the patient community over the long term.

References:

1. White PD, Goldsmith KA, Johnson AL et al. (2011) Comparison of adaptive pacing therapy, cognitive behaviour therapy, graded exercise therapy, and specialist medical care for chronic fatigue syndrome (PACE): a randomised trial. Lancet 377:823-36.
2. McCrone P, Sharpe M, Chalder T, Knapp M, Johnson AL, Goldsmith KA, White PD. Adaptive pacing, cognitive behaviour therapy, graded exercise, and specialist medical care for chronic fatigue syndrome: a cost-effectiveness analysis. PLoS ONE 2012; 7: e40808.
3. Chalder T, Goldsmith KA, White PD, Sharpe M, Pickles AR. Rehabilitative therapies for chronic fatigue syndrome: a secondary mediation analysis of the PACE trial. Lancet Psychiatry 2015; 2:141–52
4. White PD, Goldsmith K, Johnson AL, Chalder T, Sharpe M. (2013) Recovery from chronic fatigue syndrome after treatments given in the PACE trial. Psychol Med. 43:2227-35.
5. Bleijenberg G, Knoop H. Chronic fatigue syndrome: where to PACE from here? Lancet 2011; 377:786-8
6. The Times: https://www.thetimes.co.uk/tto/health/news/article2917876.ece
7. Daily Mail: https://www.dailymail.co.uk/health/article-1358269/Chronic-fatigue-syndrome-ME-patients-exercise-best-hope-recovery-finds-study.html
8. Sharpe M, Goldsmith KA, Johnson AL, Chalder T, Walker J, White PD. Rehabilitative treatments for chronic fatigue syndrome: long-term follow-up from the PACE trial. Lancet Psychiatry 2015; 2:1067-74.
9. Telegraph: https://www.telegraph.co.uk/news/health/11959193/Chronic-Fatigue-Syndrome-sufferers-can-overcome-symptoms-of-ME-with-positive-thinking-and-exercise.html
10. NHS Choices: https://www.nhs.uk/news/2015/10October/Pages/Exercise-and-therapy-useful-for-chronic-fatigue-syndrome.aspx
11. https://www.nationalelfservice.net/other-health-conditions/chronic-fatigue-syndrome/the-pace-trial-for-chronic-fatigue-syndrome-choppy-seas-but-a-prosperous-voyage/ (accessed 13th Jan 2016.)
12. White PD, Sharpe MC, Chalder T, DeCesare JC, Walwyn R; PACE trial group. Protocol for the PACE trial: a randomised controlled trial of adaptive pacing, cognitive behaviour therapy, and graded exercise, as supplements to standardised specialist medical care versus standardised specialist medical care alone for patients with the chronic fatigue syndrome/myalgic encephalomyelitis or encephalopathy. BMC Neurol. 2007 Mar 8;7:6.
13. https://www.meassociation.org.uk/2015/11/we-settle-our-differences-with-the-daily-telegraph-19-november-2015/

Reply ↓
- Moss on January 13, 2016 6:52 PM at 6:52 pm said:
  
  “Perhaps the most interesting statistical issue is the recovery criteria [4]. Some basic errors were involved here, such as using a non-representative demographic sample to determine recovery thresholds, and inappropriately using a mean and standard deviation for data that doesn’t have a normal distribution, to calculate the normal range.”
  
  White et al were forced to concede the errors.
  
  But amazingly, White (the lead author of PACE) has repeated the same errors in another recent paper he co-authored.
  
  It is hard to believe it was just a mistake the second time around.
  
  Collin Simon M., Nikolaus Stephanie, Heron Jon, Knoop Hans, White Peter D., Crawley Esther,
  Chronic Fatigue Syndrome (CFS) symptom based phenotypes in two clinical cohorts of adult patients in the UK and The Netherlands
  Journal of Psychosomatic Research(2015), doi: 10.1016/j.jpsychores.2015.12.006
  
  Reply ↓
- Mike on January 14, 2016 4:17 AM at 4:17 am said:
  
  “From a statistical point of view, there may not be much of interest in it.” Generally agreed, but there was “analysis switching” as well as “outcome switching”. The 2007 protocol promised intention-to-treat analysis, where dropouts are included but count as failures of the randomized treatment. This is important in pragmatic trials designed to test real world effectiveness, such as PACE.
  
  However, in the 2011 Lancet paper they imputed hypothetical scores for the patients lost to follow-up. Why bother? There were few dropouts in the main (subjective questionnaire) outcomes, so why not stick to ITT? As John pointed out (at 2.51 pm), there were many GET dropouts for the 6-Minute Walking Test. This is the red flag that suggests the reported 6MWT result for the GET arm may be a false positive.
  
  Reply ↓
RCT on January 13, 2016 5:13 PM at 5:13 pm said:

When CBT is hammered into the RCT framework the end result is often an increase in clinical uncertainty
https://www.methodsappraisal.com/2012/08/bmc-psychiatry-adult-adhd/

Reply ↓
Carolyn Wilshire on January 13, 2016 5:56 PM at 5:56 pm said:

“This seems reasonable to me, and not in contradiction with the points that Wessely made. Indeed, when I asked Wessely what he thought of this, he replied that an independent review group in a different country had already re-analyzed some of the data and would be publishing something soon. So maybe we’re closer to convergence on this particular study than it seemed.”

Wouldn’t be nice if all researchers could hand-pick who they want to “independently review” their data in this way?

Reply ↓
Moss on January 13, 2016 6:24 PM at 6:24 pm said:

Wessely has described the PACE trial as “large and elegant”, and “a thing of beauty”. [1,2]

PACE was testing the illness model that Wessely was instrumental in creating. He supplied patients and advice to the PACE trial, and has defended and promoted it (and in particular the CBT & GET components of it) very vigorously indeed.

More disturbingly, he has continued to do so even after the null result reported in the recent 2.5 year follow-up paper from PACE:

“(CBT and GET) are superior to adaptive pacing or standard medical treatment, when it comes to generating improvement in patients with chronic fatigue syndrome,…” [2]

There is no possible way that interpretation can be sustained by the data in the follow-up paper. It was clearly a null result. The paper itself even said so:

“There was little evidence of differences in outcomes between the randomised treatment groups at long-term follow-up.”

Wessely is most certainly not a neutral objective observer on PACE. To the contrary, his ideas are at the core of how PACE was set up and run. He has a massive vested professional and personal interest in seeing it report a particular outcome.

1. Nature Reviews Neuroscience 12, 539-544 (September 2011) | doi:10.1038/nrn3087

2. https://www.nationalelfservice.net/other-health-conditions/chronic-fatigue-syndrome/the-pace-trial-for-chronic-fatigue-syndrome-choppy-seas-but-a-prosperous-voyage/

Reply ↓
- Moss on January 13, 2016 6:32 PM at 6:32 pm said:
  
  Forgot to add the reference for the 2.5 year follow-up paper:
  
  Sharpe M, Goldsmith KA, Johnson AL, Chalder T, Walker J, White PD. Rehabilitative treatments for chronic fatigue syndrome: long-term follow-up from the PACE trial. Lancet Psychiatry 2015; 2:1067-74.
  
  Reply ↓
Nasim Marie Jafry on January 13, 2016 6:42 PM at 6:42 pm said:

Hi, Andrew, For what it’s worth, Simon Wessely does not study ‘classic ME’ as described by the late Dr Melvin Ramsay, he researches/ researched and treats chronic fatigue syndrome, defined by the much criticised and maligned Oxford criteria. I have had emails with Simon – as others with ME have – and it is crystal clear that he does not agree with Melvin Ramsay, or indeed the consultant neurologist who diagnosed me in 1984, on the essence of ME. He – and other psychiatrists – have tried for years to psychologise a complex neuroimmune illness, albeit a poorly understood one. PACE is simply an extension/continuation of the psychologising of the illness, and it is dangerous. But as serious biomedical research progresses, this charade, and it is a charade, of conflating ME with unexplained fatigue will end. Only then will patients with ME have a fair chance of treatment, at least in UK. I often thank my lucky stars that I was diagnosed when I was, pre- influence of this particular group of medics. The involvement of these psychiatrists has been a catastrophe for ME patients. And I don’t exaggerate. It makes me weep. I’ve been ill since 1982, diagnosed in 1984, and I think if walking and talking were going to improve my symptoms, I would jump at the chance, no pun intended.

Reply ↓
- Findlow on January 14, 2016 9:23 AM at 9:23 am said:
  
  I agree with you wholeheartedly, Nasim Marie. As another very long-term pwME, it is heartbreaking to see the catastrophe – as you rightly call it – still pervade. I am horrified to read just now about the attack on the great paediatrician Dr Nigel Speight, whose invaluable support for families of children with ME will be well aware of. But like other doctors who rejected the biopsycho/Wesselyite/PACE nonsense it seems the knives have long been out for him, and now Crawley et al have gone to the GMC. Outrageous and heartbreaking to read of a good doctor being treated like this, and yet another patient harmed by forced exercise – ambulant when admitted to Bath hospital and bed-ridden on discharge.
  
  https://spoonseeker.com/2016/01/14/please-support-dr-nigel-speight/
  
  https://iacfsme.org/PDFS/Reporting-of-Harms-Associated-with-GET-and-CBT-in.aspx
  
  Reply ↓
Stewart on January 13, 2016 10:09 PM at 10:09 pm said:

Andrew – thanks for your interest in the PACE trial so far, and I hope that you’ll continue to investigate and write about this subject.

There have been lots of good points made in the comments already – which I won’t repeat – but I don’t think anyone’s mentioned yet just how weak and inadequate the PACE team’s explanations were for dropping/downplaying the objective measures from the trial protocol.

Their explanation for abandoning the use of actometers was that they felt this would be “too great a burden at the end of the trial” (1) – even though they’d already carried out baseline actometer tests at the trial’s outset. They downplayed the importance of the walking test results because they only had “10 metres of walking corridor space available, rather than the 30-50 metres of space used in other studies; this meant that participants had to stop and turn around more frequently, slowing them down and thereby vitiating comparison with other studies” (2).

They had nearly £5 million of funding and yet they couldn’t obtain the use of a longer corridor? They went to the effort of taking the baseline actometer data but then decided not to obtain the end data because it would ‘inconvenience’ participants? I don’t see how Wessely can describe the trial as a “landmark” or a “thing of beauty” – even if you were to take the PACE team’s explanations at face value you would have to conclude they come across as a bunch of rank amateurs who don’t seem to recognise the importance of objective outcome measures, and consequently abandon them far too readily.

1. https://bmcneurol.biomedcentral.com/articles/10.1186/1471-2377-7-6/comments – Peter White, 28 July 2008
2.https://www.meassociation.org.uk/2013/07/pace-trial-letters-and-reply-journal-of-psychological-medicine-august-2013/ – Peter White, Journal of Psychological Medicine, August 2013

Reply ↓
Kati D on January 14, 2016 1:25 AM at 1:25 am said:

Ask Dr Ranjit Chandra what he think about formula feeding for newbord and ask him about his research, and what you will find will be nothing extraordinary, because he is a man who will always defend his interests. Asking Simon Wessely what he thinks is the exact same thing as Ranjit Chandra. Conflicts of interest and protecting his assets including his reputation.

You can find out about Ranjit Chandra here : https://en.m.wikipedia.org/wiki/Ranjit_Chandra

We are 30 years into a disease which has been disallowed biomedical funding due to the gravitational pull from the british psychiatry lobby. Patients are and have simpy been discarded by their health care system because it is supposed to be ‘all in the head’.

‘Significant improvement in 6 minute walk test showed patients could actually walk 60 more metres after a 1year graded exercise therapy training program. So instead of being able to walk 300 m in 6 minutes, they were able to walk 360 m. Significant? Life altering therapy? Who are they kidding? Everybody but the patients.

Reply ↓
Andrew K on January 14, 2016 4:34 AM at 4:34 am said:

The study is unblinded, so it is quite likely that short term biases in reporting could effect the result. Specifically, patients may report changes in fatigue or better physical functioning without any objective change in their physical functioning or the impact of fatigue on their daily life. We know this from meta-analyses of CBT trials which used actigrapy and neuropsychological testing which showed no change.

https://www.ncbi.nlm.nih.gov/pubmed/20047707
https://www.ncbi.nlm.nih.gov/pubmed/17369597

These biases are why we insist on blinding in pharmacological trials.

If the initial reporting of symptoms was temporarily biased, then we would expect the difference to disappear after 2.5 years. That is exactly what happened.

The other BS is the recovery criteria – a valid comparison group for the SF-36 is not the working age population, but the healthy working age population which has a skewed distribution due to the ceiling effect and a mean and median PF score in the 90s and a SD of about 10. A score less than 80 (at follow up) in a person who is not elderly and frail, suggests they have poor health. When you have systematically excluded people with other health conditions from participating in the trial, it means that low score must be due to the illness you were trying to treat in the first place. Hence a score less than 80 cannot be considered ‘recovered’ – the original protocol required a score of 85, above which would suggest good health.

Reply ↓
Rosie on January 14, 2016 5:04 AM at 5:04 am said:

“Having said that, there were a significant number of patients who did not improve with these treatments. Some patients deteriorated, but this seems to be the nature of the illness, rather than related to a particular treatment. . . .”

Mmm so those that improved can be considered to have done so due to CBT and GET and those that didn’t … well it must have been the vacillating nature of the disease itself! Very scientific!

I’m seriously underwhelmed by this blog. We know Wessley has a syrupy tongue but I thought someone of Gelman’s stature would read past the ‘pillow talk’ and realise that there is something very wrong in the state of PACE data analysis. Sad to see this.

Reply ↓
S W on January 14, 2016 6:35 AM at 6:35 am said:

There is not much that I can add here as an ME/CFS patient with ten years under my belt and a Biology degree. The strength of critique from members of the community is rock solid and I doff my hat to the commenters here for their efforts again as usual, to clarify published issues. I also thank Andrew Gelman for his time spent reviewing this issue and commenting on this complicated area, although of course I was disappointed to read that some typical Wessely tactics helped to divert the attention from the fatal flaws of PACE. Its also difficult to appreciate the understanding patients have of Wessely having witnessed his comments and control over this illness (in the UK) for more than two decades, so that in isolation and superficially they might sound reasonable, but actually belie a hypothesis that is not only totally discredited but also has been diabolically harmful to the chances of ME/CFS patients who have suffered with stigma and lack of effective treatment for decades, in place of his bogus theories. Its really no joke, hundreds of thousands of peoples lives affected, each with not just days or weeks but years and often decades of suffering. THIS is the reason why good science is a MUST and MUST be applied and fought for.

Reply ↓
- Findlow on January 14, 2016 11:51 AM at 11:51 am said:
  
  Thank you for drawing attention to this:
  ” Its also difficult to appreciate the understanding patients have of Wessely having witnessed his comments and control over this illness (in the UK) for more than two decades, so that in isolation and superficially they might sound reasonable, but actually belie a hypothesis that is not only totally discredited but also has been diabolically harmful to the chances of ME/CFS patients who have suffered with stigma and lack of effective treatment for decades..”
  
  The perspective that long term patients bring to this whole shambles is unique. I’ve had ME for thirty years, and suggest that the NHS/NICE guidelines sanctioned “treatment” for this serious physical disease is as stupid and ridiculous as if the only available treatment for Type 1 diabetes was graded doughnut eating and addressing the patient’s “false beliefs” about consumption of sugar – oh, whilst ignoring a huge body of medical evidence about failed islet cells and the role of insulin…
  
  For people new to the whole subject, the following blog draws attention to Margaret William’s piece “Wessely, Woodstock and Warfare?” which is a very useful historical background that devastatingly reveals some of the murky politics involved.
  
  https://blacktrianglecampaign.org/2012/09/09/dwpunumatos-scandal-professor-simon-wessely-it-is-only-human-for-doctors-to-view-the-public-as-foolish-uncomprehending-hysterical-or-malingering/
  
  Reply ↓
Graham McPhee on January 14, 2016 7:36 AM at 7:36 am said:

Thank you for entering this debate, Andrew: I wonder if you anticipated getting so many responses.

I’m an ex-Head of Maths who now has ME, and approach the PACE trial in a much more basic way. In their recovery paper of 2013, PACE claimed that 22% of patients in the CBT group recovered – and they no longer satisfied the operationalised criteria for CFS/ME (or 21% using all three of Oxford, London or International criteria). If you get a chance to talk with any of the PACE authors again, ask them whether any or all of those recovered could still be diagnosed by a specialist as having CFS/ME. The answer is shocking. All that it needed to negate the operationalised diagnosis is for a patient’s score on the sf-36 questionnaire to rise a little to 70 (see the top right paragraphs of the third page of the recovery paper). That’s all. The patient’s general symptoms could be the same or even worse, and there could be no change in any of the objective assessments, but if a couple of boxes were ticked in a different way, that would be enough.

The “threshold” of 70 was purely arbitrary and is not part of normal diagnosis. It is simply the notch above the entry criteria, which itself was changed to increase recruitment. A healthy adult of that age would expect to score 95 or 100. Ironically, if a patient with that score, and “recovered” according to PACE, were able to travel back in time, she or he would still qualify as being sufficiently ill to enter the FINE trial, the sister trial to PACE and aimed at patients unable to attend ME centres. At least 300 people diagnosed with CFS were not accepted on the PACE trial because their scores were 70 or more. It could be argued that being refused entry to the PACE trial created the most successful group, because, with scores of 70 or more, none of them would fulfil the operationalised criteria of having CFS.

This total misunderstanding of what was meant by “recovered” is widespread. In a discussion in parliament, Baroness Hollins, a professor of psychiatry, said “My understanding is that, by “recovery”, the researchers included the fact that after a year these patients no longer met the criteria for CFS/ME. This would be heralded as a fantastic outcome for the treatment of MS, Parkinson’s disease or cancer.” But of course, it didn’t mean that at all.

The problem is that the dataset in PACE is so complex, it is easy to get bogged down in the detail. The key question is “What message did the PACE trial convey to the media in general and to the medical profession in particular?” I would argue that it was a very deceptive one.

Reply ↓
- Andrew on January 14, 2016 10:14 AM at 10:14 am said:
  
  Graham:
  
  One of the things I was trying to wrestle with in my post was the distinction between what was found in the PACE study and what was claimed. It seemed to me that much of the criticism of the PACE investigators was that they overstated what they found. Is it possible that there were small improvements on the continuous-scaled measure, but not the “recovery” that was stated or implied at various times? That would be consistent with the idea that these treatments could be helping some people (after all, CFS is a broadly-defined syndrome, so it should be no surprise that CBT and physical therapy could help some of them) but without the clear resolution implied by Richard Horton.
  
  Reply ↓
  - S.T. on January 14, 2016 11:41 AM at 11:41 am said:
    
    Andrew, you said, ” It seemed to me that much of the criticism of the PACE investigators was that they overstated what they found. Is it possible that there were small improvements on the continuous-scaled measure, but not the “recovery” that was stated or implied at various times?”
    
    I think there are three areas of criticism of PACE that relate to this issue.
    
    (1) Post-hoc outcome-switching: the PACE authors, with knowledge of the data, came up with ludicrous “normal ranges” for self-rated fatigue and physical function and used those inflationary analyses instead of the planned ones.
    
    (2) They presented the findings as “recovery” as opposed to the more accurate “pretty much nothing”. This has been catastrophic for patients, who are on the receiving end of these potentially very harmful (for people with ME/CFS) treatments.
    
    (3) There were small improvements for CBT and GET on the continuous measures but these are self-ratings of fatigue and physical function, not objective measures of them – and the objective findings contradict these improvements. In such a case, it simply makes no sense to treat the improvements as real. The way the study was conducted would have put huge social pressure on patients in the CBT and GET groups, but not the others, to present themselves as having improved.
    
    Reply ↓
  - Adrian on January 14, 2016 12:03 PM at 12:03 pm said:
    
    I don’t think the scales they used should have been considered as continuous scales. They are the sum of questionnaire answers and probably don’t form linear continuous scales.
    
    Also what they measure is basically a subjective view on symptoms and abilities but two of the four treatment arms were aimed at trying to change patents perceptions of symptoms and abilities so they are not good measurement systems for the treatments.
    
    What would be interesting is to correlate the changes in questionnaire scores with changes in the more objective measures. I would hope that if the trial results as published were reliable then they should more or less agree. However, this data has not been released so we will never know. But where there are summary stats the more objective measures don’t seem to back the improvements in questionnaire scores.
    
    Reply ↓
  - Valentijn on January 14, 2016 12:34 PM at 12:34 pm said:
    
    It’s unlikely that CBT and GET helped any of the patients, beyond SMC, even in the short term. And with the recently published long-term followup, we now know that there was 0 benefit for CBT or GET after 2.5 years when compared to the control arm.
    
    They’ve spun this to say the supposed earlier gains were maintained, ergo CBT and GET are supposedly still a success. But that sort of interpretation completely negates the benefits of having a controlled trial in the first place.
    
    Reply ↓
  - Moss on January 14, 2016 12:42 PM at 12:42 pm said:
    
    “Is it possible that there were small improvements on the continuous-scaled measure, but not the “recovery” that was stated or implied at various times? That would be consistent with the idea that these treatments could be helping some people (after all, CFS is a broadly-defined syndrome, so it should be no surprise that CBT and physical therapy could help some of them) but without the clear resolution implied by Richard Horton.”
    
    —–
    
    All but one of the more objective measures used in PACE reported no benefit, the exception being the 6MWD test for the GET arm, which still didn’t reach clinical significance and left patients scoring down among the sickest of disease groups medicine deals with.
    
    (In an interesting twist, the PACE team have since downplayed and effectively disowned the 6MWD results, the only objective test on which they scored any statistically significant result.)
    
    A very modest improvement in self-report measures, after a year of patients being primed to report an improvement on those measures, plus a practically meaningless increase for one arm on one objective measure, all coming off a very low baseline, cannot be honestly described as a safe, robust, clinically meaningful result.
    
    That is before we take into account methodological issues, and especially the critical null result in the 2.5 year follow-up paper, which I strongly suggest on its own renders all previous claims about the PACE results null and void, and largely disproves the cognitive-behavioural model being tested by PACE.
    
    Patients and PACE critics are not the problem here.
    
    Reply ↓
  - Valentijn on January 14, 2016 12:45 PM at 12:45 pm said:
    
    I would also add that ME/CFS is only a “broadly-defined syndrome” when certain therapists decided that they wanted it to be. In PACE they used the Oxford criteria, which is literally only debilitating chronic fatigue without an identified cause. Every other definition at least lists Post-Exertional Malaise (objectively verifiable via 2-day CPET) as a symptom.
    
    Patients, their doctors, and their advocates overwhelmingly favor PEM as a required criteria for diagnosis (or at least for research). Even the UK does not officially endorse the ridiculously lax Oxford definition, yet the PACE authors continue to contend that it’s applicable to more strictly defined ME/CFS patients.
    
    Using such a broad criteria allows these researchers to recruit patients who often do not have ME/CFS at all, and “prove” that CBT/GET subjectively cures them of their subjective fatigue. Then they can tell insurance companies and the Department of Work and Pensions that all we need is a little tough love to become well again, and to quote Wessely, “Benefits can often make patients worse.”
    
    Reply ↓
  - Leveller on January 14, 2016 2:37 PM at 2:37 pm said:
    
    “Is it possible that there were small improvements on the continuous-scaled measure, but not the “recovery” that was stated or implied at various times?”
    
    Yes (with noting Adrian’s concerns above about this description of the questionnaires used). And the same is true for a wide range of quackery assessed with a nonblinded trial, from aversion therapy for homosexuality to homeopathy for all sorts.
    
    “That would be consistent with the idea that these treatments could be helping some people (after all, CFS is a broadly-defined syndrome, so it should be no surprise that CBT and physical therapy could help some of them)”
    
    It would also be consistent with CBT/GET being time consuming and worthless for patients in a trial setting, worse than that outside of trials, and for their promotion as evidence based treatments to lead to further social problems for patients too. The evidence from the PACE trial does not let us say whether or not CBT or GET genuinely helped a small number of patients or not, but it does let us say that the researchers have misrepresented their results.
    
    If people want to spend their time and money on homeopathy, or anything else, then that should be up to them; but informed consent is important. Lots of CFS patients are engaging in CBT/GET, finding that it is unhelpful, and then later becoming aware of how the evidence has been misrepresented to them. This is feeding in to the culture of animosity and distrust which now surrounds the condition, particularly in the UK.
    
    The possibility that CBT/GET may help some patients does nothing to justify the behavior of the PACE trial’s researchers.
    
    Reply ↓
  - Graham McPhee on January 14, 2016 7:39 PM at 7:39 pm said:
    
    The problem is that I’m not looking at it as an academic exercise: I’m looking at it from the sharp end, and believe me, with poor quality psychological studies overwhelming any biomedical findings, this end really is sharp! Far, far more public money has been spent on this one psychological trial than has been spent on biomedical research into the illness over the last 25, possibly 35 years. It is so hard not to get worked up about it.
    
    I do understand your perspective and utterly support your wish to understand this. I am pleased that you have opened up the debate and continue to push us to give you satisfactory answers. But suppose I decided to produce a cure/treatment for epilepsy, but first of all redefined epilepsy so that the number of people satisfying the criteria was more than doubled. Then my treatment was found to have a small positive effect on a minority of my new group. Would you accept it as a valid treatment for all patients with a diagnosis of epilepsy? What if you found that it all relied on two subjective questionnaires, one of which was only designed for and used for my newly defined group, and that no objective evidence could be produced to show any improvement?
    
    I have looked in depth at the Chalder Fatigue Questionnaire and its scoring systems, using results from an informal study of 123 people with ME, and now with the results from the FINE trial. Patients misinterpret the scoring system (especially the zero, and the “comparison with when you were last well” instruction), inflating results. ME is a variable illness, yet cutoffs were used to select those who were more severely affected at the time, meaning that reversion to the mean is a potential problem, again inflating results. A significant proportion of patients are crammed right at the bottom of the scale, unable to register any deterioration, so random improvements cannot be balanced by random deteriorations. All these aspects force the results upwards. The Chalder Fatigue Scales have not been robustly tested outside the ME community, and have hardly been tested within: there has been no assessment of individual variability over time, or of patient difficulty in completing the questionnaires as intended. There has been no study on the effect of the low score cut-off. The problem can be reduced but not eliminated by focusing on the differences between groups, which is what the study was designed to do, but most of the claims made by the authors for effectiveness quote absolute measurements and count individual improvements. From my perspective, there are too many uncertainties for such measures to be robust enough to reflect the small improvements measured by individuals.
    
    Does the PACE data look like this? Who knows? Until it is analysed by independent people, I cannot have any faith in it. I’m not an academic: I’m not a researcher. But I do know that if a student had presented me with a draft project of this standard, I would have torn it up, insisted on a fresh start, then gone home and worried about what a mess I had made teaching this subject.
    
    Reply ↓
    - Andrew on January 14, 2016 9:15 PM at 9:15 pm said:
      
      Graham:
      
      I completely agree with you that the data should be made public and available to independent researchers.
    - David E on January 19, 2016 2:35 PM at 2:35 pm said:
      
      the sharp end is often the best end to analyse from. The so called ‘independent reviewers’ appointed by wessely will always find what he wants them to find. Why is it so difficult to analyse and observe the clear breaches of good scientific research guidelines and principles and ethics in the PACE trial ? are some people trying too hard to please wessely or not to displease him. Look again
      http://www.me-ireland.com/bogus.htm#pace
Amy Kritz McLaughlin on January 14, 2016 12:31 PM at 12:31 pm said:

With respect to your comments on the PACE trial, I would like to clarify several basic points which you have, I believe, misunderstood. Considering the complicated and intensely politicized history of this disease, and what has been deliberate obfuscation, that’s not surprising.
I have had Myalgic Encephalomyelitis for nearly 30 years, falling ill abruptly on May 26, 1987. As an avid equestrian who enjoyed many other activities as well, it would take a lot to stop me, and if graded exercise could have restored me, I would have figured that out back in the 1980’s, all by myself. I never simply took to my bed; like most patients I know, I fought to continue functioning normally. This is a catastrophic, overwhelming disease that defeats very determined people.
You refer (understandably, given its wide use) to ‘Chronic Fatigue Syndrome’, a meaningless construct with diagnostic criteria so broad it catches a huge range of diverse conditions, including depression. The criteria for M.E., on the other hand, are very specific and clear. The first is a heterogenous group; the latter is not. The disease was described perfectly by Dr. Melvin Ramsay decades ago, based on close observation of patients during epidemic outbreaks. Frankly, there is no excuse for the continued ignorance of so much of the medical profession. I say this as a doctor’s daughter, who grew up expecting physicians to be compassionate, intuitively good clinicians, and intellectuals. What is not ignorance is active malfeasance, and morally reprehensible. This is true of the sly dissolution of the PACE trial, a farce that has taken in almost everyone but those of us who are actually sick.
You remark that it’s perfectly reasonable that exercise and therapy will help people feel better. This modern creed has taken hold so firmly that it seems to be the demarcation line where otherwise intelligent people stop thinking. Dr. Paul Cheney, one of the two doctors who reported to the CDC the outbreak of an unknown disease in the Lake Tahoe area in 1984, has said, “The whole idea that you can take a disease like this and exercise your way to health is foolishness, it’s insane.”
The central features of M.E. are cardiac insufficiency, inflammation of the brain and spinal column, and an outright immune deficiency (reduction or even complete loss of NK function) and many immune abnormalities. What educated person, understanding even just those headlines of the pathophysiology of this disease, would be unable to grasp that exercise might not only be incredibly hard, but in fact dangerous, for such patients? You think it common sense that encephalitis patients would be helped by exercise? Really?
I am separately sending you an index of some of the nearly 10,000 peer-reviewed, medical journal articles documenting the abnormalities in M.E, as well as the International Consensus Criteria for the disease. I had trouble attaching them here.

Amy Kritz (McLaughlin)

Reply ↓
- Amy Kritz McLaughlin on January 14, 2016 5:46 PM at 5:46 pm said:
  
  This is the link to the International Consensus Criteria. https://www.facebook.com/l.php?u=http%3A%2F%2Fonlinelibrary.wiley.com%2Fdoi%2F10.1111%2Fj.1365-2796.2011.02428.x%2Ffull&h=HAQHXdALr
  
  Reply ↓
TS on January 14, 2016 2:42 PM at 2:42 pm said:

How can deterioration after graded exercise be just ‘the nature of the illness’when so many of the 25% Group for severe sufferers report being made severely disabled by graded exercise?

Why are the experiences of these many patients being brushed aside?

Reply ↓
K on January 14, 2016 2:55 PM at 2:55 pm said:

Thank you again Andrew for wading into the discussion.
With all due respect I do think you are missing a few crucial points in evaluating the results of this trial.

Namely, in regards to your assumption that CBT and GET “should” help at least a subset of patients.

while it may seem common sense to someone unfamiliar with the biomedical literature on ME that CBT and exercise ‘should’ help at least some patients, The defining feature of the illness is a pathological and out-of-proportion exacerbation of symptoms following exertion. This is perhaps the area of ME&CFS research with the most robust and reproducible findings, including abnormal in-vitro muscle biopsies, abnormal 2-day CPET results, abnormal gene-expression after exercise challenge etc.

A good deal of the debate around these therapies hinges on the seemingly common-sense idea that even if they are not effective they must be essentially harmless. Unfortunately there is a great deal of evidence showing that is not the case. Every real-world evaluation of these therapies within the community has shown the majority of ME&CFS patients report worsened symptoms following GET. Including many saying they were rendered permanently bedbound or wheelchair-dependent. The recent MEA report includes dozens of (admittedly anecdotal) reports to that effect, as well as an overall finding that 74% were made worse by GET, and the vast majority reporting either no effect or worsening from CBT.
Every study of GET/CBT for ME&CFS which has collected objective measures has shown a negative or null result for GET, or has dropped the objective measures before the study was published. literally every single one. I would challenge anyone to try to find a statistically-significant, positive, objective measure in ANY study of CBT/GET for ME/CFS.
I think someone mentioned earlier in this thread the Belgium CBT/GET program that was discontinued. This was as a result of the objective measures they collected, which showed increases in objective disability measures (increased unemployment, benefit claims, etc) in those who had undertaken the GET/CBT program.

The PACE trial used a non-standard definition to select patients (the Oxford criteria), which might indeed select a rather heterogenous group of patients, who may, as you say, not all have the same condition. This is in fact why patients, charities and researchers were voicing objections even before the study started. There are a few problems here – even if they HAD gotten good results from GET/CBT, there is no indication that this can be safely expanded to patients who selected with different more stringent criteria.
Beyond that, the PACE trial started with a pool of over 3000 patients, from these patients the investigators weeded out about 2500 . So the question becomes, why? Considering the definition they used required only unexplained fatigue, and in fact dispensed with both the main symptom of the disease (PEM) and the autonomic, neurological and endocrine abnormalities required in most other definitions. Did the investigators believe that thousands of patients referred to the trial were misdiagnosed? less likely to respond to the therapies? It becomes a complete muddle to figure out what disease the researchers intended to study, and who their results could reasonably be applied to.
ME&CFS may indeed be a heterogenous set of diseases, but a lot of this comes down to definitions. and a good deal of the heterogeneity was introduced by the PACE investigators when they created the much broader Oxford criteria, criteria which US government reports (p2p and IOM) have strongly urged should not be used for research as they are likely to “impair progress” and “cause harm”. Certainly the results get much more consistent with more stringent criteria. A recent meta-analysis of NK-cell function compared the results across dozens of studies of ME&CFS and found that the results were consistent and reproducible when stratified by disease-definition – so much so that they suggest it could be a biomarker. The Rituximab trials have found 60% responders using the Canadian Consensus Criteria to select patients. So the disease may not be all that heterogenous when it is properly defined. The problems are introduced when certain groups want to widen criteria rather than restrict or subgroup.

So considering all of these factors, the wide definition, how incredibly stacked the deck seems to be, and the post-hoc changes to all the outcome measures, how did this trial STILL not produce results?
The type of CBT and GET used in this trial are not standard CBT and GET, they are a form of these therapies designed specifically to encourage patients to ignore their symptoms and continue even if they seem to be deteriorating. (The trial manuals and protocol are available online and are very clear as to this methodology should you want to look it up). If there is one robust finding from the PACE trial, I would say that it is that bullying patients into not being sick doesn’t actually work for ANY disease.

Reply ↓
- Steve W on January 16, 2016 6:56 AM at 6:56 am said:
  
  exactly – it is amazing even with the massive contortions made as you describe – despite all of them – the trial still failed to produce any results. That says it all.
  
  Reply ↓
Amy Kritz McLaughlin on January 16, 2016 6:54 PM at 6:54 pm said:

I would like to just offer a little follow up to my comment the other day on your blog. I’ve just emailed you a list of medical journal articles, admittedly selective, because there are now thousands. I have chosen several that most directly address the issue of ME patients’ exercise capacity, and whether or not it relates to an organic disease (supported by these articles) or ‘false illness beliefs’, as claimed by Dr. Wessely and his colleagues.
Dr. Wessely has been consistently and, it seems to those of us who are so very ill, gratuitously, hostile to ME patients. Some years back, he remarked, ‘Symptoms include muscle pain and many somatic symptoms, especially cardiac, gastrointestinal and neurological. Do any of these symptoms possess diagnostic significance? The answer is basically negative…the description given by a leading gastroenterologist at the Mayo Clinic remains accurate. ‘The average doctor will see they are neurotic and he will often be disgusted with them’.”
Yet after a recent study using Rituximab, a cancer drug, showed very encouraging results in ME patients, Dr. Wessely said, “The belief that CFS is all in the mind has been around since the beginning. It’s tragic that it might take a study like this to take sufferers seriously.”
The tragedy is that Dr. Wessely and others who have glibly psychologized this very serious organic illness, have an enormous responsibility for blocking the sort of real research that might have resulted in useful treatments years ago.
Thank you for your interest in the PACE Trial.

Reply ↓

Statistical Modeling, Causal Inference, and Social Science

Pro-PACE, anti-PACE

78 thoughts on “Pro-PACE, anti-PACE”

Leave a Reply Cancel reply