Another Wegman plagiarism

At the time of our last discussion, Edward Wegman, a statistics professor who has also worked for government research agencies, had been involved in three cases of plagiarism: a report for the U.S. Congress on climate models, a paper on social networks, and a paper on color graphics.

Each of the plagiarism stories was slightly different: the congressional report involved the distorted copying of research by a scientist (Raymond Bradley) whose conclusions Wegman disagreed with, the social networks paper included copied material in its background section, and the color graphics paper included various bits and pieces by others that had been used in old lecture notes.

Since then, blogger Deep Climate has uncovered another plagiarized article by Wegman, this time an article in a 2005 volume on data mining and data visualization. Deep Climate writes, “certain sections of Statistical Data Mining rely heavily on lightly edited portions on lectures from Wegman’s statistical data mining course at GMU. In turn, those lectures contain ‘copy-and-paste’ material from a variety of sources, some partially attributed and some not at all.” It looks pretty bad. And, as with the other cases of plagiarism, sometimes the small changes they made caused errors that were not in the original sources. Ouch!

One of the authors Wegman stole from was Brian Everitt. Couldn’t Wegman have just invited Everitt to be a coauthor of his article? To steal his work, that’s sooooo tacky.

John Mashey wrote to me:

See here for a 12-pager that highlights, not the plagiarism in the Wegman report, but some of the easiest-to-see problems that I think rise to falsification/fabrication, always much harder to explain, especially to a general audience.

This doesn’t go deeply into the statistical turf, where Deep Climate’s is a key reference. A chunk of it started with DC’s analysis of problems with the way they hacked Bradley’s tree-rings, and applies another visual display style to bring that out and explain to a broader audience.

I suspect this goes beyond the laziness of typical plagiarism, but when one finds a mass of plagiarized text, the embedded changes leap out.

I replied:

It’s frustrating when people can’t just be direct. If only Wegman et al. had just written that they had read Bradley’s report, then cited it, then explained why they disagreed. I’ve noticed this in other academic settings–not usually involving plagiarism, but a setting where researcher A disagrees with researcher B, but instead of citing and quoting B’s work, A will simply vaguely refer to what B is doing and then disparage it, without even the courtesy of a citation. What Wegman was doing was worse, as he was actually manipulating Bradley’s words, but it has that same annoying indirect manner. If you want to disagree with someone, I think it’s best to directly explain what you’re disagreeing with!

To which Mashey replied:

I agree, but I think that can be fine-tuned. I’ve read many an expert disagreement in various literatures, but I’ve always thought the paradigms were:

a) Expert A offers data, analysis and conclusions.
b) Expert B cites a) carefully, then says:
– The data was confounded, or was wrong, or there is new data.
– The analysis was wrong, the error bars are too small, etc.
– The data and analysis are OK, but the conclusions go beyond the data
– My model and analysis does a better job of accounting for the data.
– We have 2 analyses that both look pretty good, but significantly disagree, so either both are somewhat off, or one is more off and we need more study.

[For example, the long-term argument between the satellite temperature measurements and the ground stations, as claimed by Spencer and Christy’s UAH analysis, which made warming go away, was shown in 2005 to be a programming error, reversing a sign. The other thing that confuses people no end is the 1000-year temperature reconstructions, all valiantly trying to extract signal from noise; non-experts see differences and assume they disagree … but of course, some reconstruct globally, some Northern Hemisphere, some a latitude band within NH … so of course they look different, as the experts all expect.]

This is the only case I’ve ever seen like this, especially in such a high-profile report. My [Mashey’s] hypothesis is:
– Wegman & Said had zero expertise in paleoclimate and were totally incapable of citing Bradley and arguing with him. Said’s 2007 talk admitted such lack of expertise.
– So they plagiarized to simulate expertise, I conjecture figuring that experts would skip over it, especially seeing cites of Bradley tables beforehand (albeit with ludicrous errors inserted in copying them) and a vague Bradley cite at end.
– They couldn’t just quote the relevant sections, because they had the “wrong” answers, and anyone who actually reads Ray’s book knows that it is mostly about the techniques for extracting signal from noise and coping with the various confounders.
– They couldn’t quote some sections and then change the conclusions without being totally obvious.
– So, they copy a bunch and then diddle it, with no quotes.

Not being an academic, there could be many cases like this that I wouldn’t know about, but I think plagiarism+falsification like this seems rare. But I do think it goes beyond laziness, because it took editing to make these changes.

I don’t know any of the people involved–I don’t think I’ve ever even met any of them. And, while I’ve done work in climate reconstruction, social networks, and color graphics, I don’t consider any of these to be my primary areas of expertise.

But there’s something about plagiarism, whether it comes from Doris Kearns Goodwin or Frank Fischer or Edward Wegman, that just fascinates me. My main explanation of plagiarism is that it’s laziness, the desire to simulate expertise or creativity where there is none. I just get so frustrated that people feel they have to lie–especially prominent people such as Goodwin or Wegman. If you are writing something and you find something written by someone else that’s just too good not to steal, just credit them! You can write something like, “Scholar X gave the following excellent description of . . . ” Or, even, “Scholar X writes . . . but I disagree for the following reason . . .”

  1. Thanks!
    Of course, there are actually more. It's hard to keep up with DC, so I have to keep redrawing summary charts. I did drop the NCAR (2007) case out as relatively minor and am running out of space.

    See Strange Tales and Emails, Appendix B.1, p.15. That has many of them. Of course, the Wegman Report alone had 35 pages with near-verbatim text from elsewhere: 10 that DC found, 25 (easier ones) that I found.

    The current count is:
    16 cases (15 public), although that counts 2 for the 2 different courses that led to the color paper, and 1 for the course that led to the Wegman, Solka paper DC just wrote about. So subtract 3 if you like.

    Of the 15 that are publicly documented:
    4 are Wegman-supervised PhDs, including Said's.
    8 have Wegman as 1st author (incl 3 courses)
    2 have Said, Wegman, and sometimes others.
    1 is Said alone

    DC also mentioned the last one in the same recent post. That's Chapter 13 of Rao, Wegman, Solka, Eds, Handbook of Statistics, 2005. Needless to say, I would guess the chances of Rao having anything to do with this are ~0.

    So far, there are 7 distinct plagiarism chains, as once material gets pulled in, it seems to get re-used.

    Of course, no claim is made that this is a complete list, as some seemingly-related papers are hard to find.

