Death of the Party

Posted on February 4, 2017 9:08 AM by Andrew

Under the subject line, “Example of a classy response to someone pointing out an error,” Charles Jackson writes:

In their recent book, Mazur and Stein describe the discovery of an error that one of them had made in a recent paper, writing: “Happily, Bartosz Naskreki spotted this error . . .” See below for full context.

That is from page 129 of Prime Numbers and the Riemann Hypothesis, Barry Mazur and William Stein.

See how easy that was?

Are you listening, ~~himmicanes people?~~ ~~fat-arms-and-voting people?~~ ~~ovulation-and-clothing people?~~ ~~ovulation-and-voting people?~~ ~~air-rage people?~~
~~Satoshi?~~ ~~Daryl?~~ ~~John?~~ ~~Roy?~~ ~~Marc?~~ ~~David?~~ ~~Amy?~~ ~~Andy?~~
~~Weggy?~~
~~Brian?~~
Bueller? Bueller?

49 thoughts on “Death of the Party”

Shecky R on February 4, 2017 9:34 AM at 9:34 am said:

You left out Sean Spicer…

Reply ↓
- Andrew on February 4, 2017 11:30 AM at 11:30 am said:
  
  Shecky:
  
  Oh, if we get started on politicians we’ll never stop. In politics and business you’re supposed to lie and cheat where appropriate; that’s part of the game. Science and journalism are supposed to be a different story.
  
  Reply ↓
Jordan Anaya on February 4, 2017 10:12 AM at 10:12 am said:

My blog post detailing more errors is taking a while since there are a lot of errors to wade through, but I wanted to give you a small taste (pun intended again).

Take a look at Table 2 from “Bad Popcorn in Big Buckets”: https://www.ncbi.nlm.nih.gov/pubmed/16053812, the publication which seems to have started this whole container size movement thing.

You can view the table here: http://imgur.com/a/kjvfn

An entire column is mislabeled!

And no, I’m not talking about the fact that the “Freshness” label is missing; the “Container Size” label should be moved one column over to the left. The column labeled “Container Size” actually reports the “Freshness” effects. As presented, the table makes it appear that the size of the container is more important than the freshness of the popcorn when it comes to taste and quality.

Interestingly, if you go to this version of the paper: http://smallplatemovement.org/doc/big_popcorn_buckets.pdf the table is correctly labeled.

In addition, some of the test statistics are wrong and the degrees of freedom seem to be wrong.
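A check of this kind can be sketched in a few lines. The snippet below is a minimal illustration, not the procedure actually used on the paper: it recomputes a one-way ANOVA F statistic and its degrees of freedom from per-group means, SDs, and sample sizes. The function name and the numbers in the example are hypothetical and are not taken from the popcorn article.

```python
def anova_f_from_summary(means, sds, ns):
    """Recompute a one-way ANOVA F statistic from summary statistics.

    means, sds, ns: per-group means, standard deviations, and sample sizes.
    Returns (F, df_between, df_within).
    """
    k = len(means)
    n_total = sum(ns)
    # Weighted grand mean across all groups
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_total
    # Between-group sum of squares
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    # Within-group sum of squares, recovered from the group SDs
    ss_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds))
    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical two-group example (made-up numbers, for illustration only)
f_stat, df_b, df_w = anova_f_from_summary([10.0, 12.0], [2.0, 2.0], [20, 20])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # prints "F(1, 38) = 10.00"
```

Because published means and SDs are rounded, a real check would recompute F over the range of values that round to the reported figures and flag a reported statistic only if it falls outside that range.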

I don’t know what I’m seeing, but I can’t look away.

Reply ↓
- Carol on February 4, 2017 4:41 PM at 4:41 pm said:
  
  Jordan Anaya: I was able to pick up a preprint version of the “Bad Popcorn in Big Buckets” article. I note that there are suppression effects in Table 3; these would not be errors.
  
  Note that in the regression of consumption on container size, freshness of popcorn, taste, and quality, the standardized regression coefficient for freshness is larger than it is in the regression of consumption on container size and freshness. Ordinarily, it would be smaller.
  
  Also note that in the regression of consumption on container size, freshness of popcorn, taste, and quality, the standardized regression coefficient for taste is larger than it is in the regression of consumption on taste and quality, and the standardized regression coefficient for quality has reversed sign.
  
  If you want to know more, get my e-mail address from Andrew or from Nick Brown.
  
  Reply ↓
- Carol on February 4, 2017 4:56 PM at 4:56 pm said:
  
  Jordan Anaya: Re Table 2, it looks as though the published version dropped a column that was in the preprint version, which could be due to a journal copyediting problem that the authors did not catch when they read the proofs.
  
  Reply ↓
- Carol on February 4, 2017 5:11 PM at 5:11 pm said:
  
  Jordan Anaya: Table 2: ANOVA F values of 101, 2012, and 194. Really?
  
  Reply ↓
  - Carol on February 4, 2017 5:12 PM at 5:12 pm said:
    
    I meant 201, not 2012!
    
    Reply ↓
  - Jordan Anaya on February 4, 2017 6:00 PM at 6:00 pm said:
    
    Yes, those values are very large, but they did give them either fresh or 14 days old popcorn, so observing an extremely high effect is understandable. Nick and I have noticed various problems with the regression models that they use, but I try to limit myself to values which are mathematically impossible.
    
    In regards to these ANOVA values, 2 out of the 3 values are mathematically consistent with the means and SDs they provide.
    
    Reply ↓
    - Andrew on February 4, 2017 7:06 PM at 7:06 pm said:
      
      Feeding people 14-day-old popcorn—that’s barbaric! How did they ever get that one through the IRB??
    - Carol on February 4, 2017 7:12 PM at 7:12 pm said:
      
      Andrew: The interesting thing is that Cornell’s IRB would not allow questions to be asked about ethnicity, height, weight, and income on that study, to protect confidentiality. (Why one might want income information in a study of stale popcorn is beyond me.)
Martha (Smith) on February 4, 2017 3:57 PM at 3:57 pm said:

I think I’ve mentioned before on this blog the response I got when I pointed out to a well-known mathematician a mistake in a book he had written:

“Oh, is my face red!”

Reply ↓
Carol on February 4, 2017 4:31 PM at 4:31 pm said:

Andrew: Who’s Bueller?

Reply ↓
- Carol on February 4, 2017 4:43 PM at 4:43 pm said:
  
  Never mind, Andrew. I got it.
  
  Reply ↓
Ben Prytherch on February 4, 2017 4:52 PM at 4:52 pm said:

Now let’s suppose that there was no logically deductive way of distinguishing prime from not prime numbers, and the most popular way of attempting this distinction was to calculate an easily manipulated probability that nearly everyone misinterprets, and then just for fun let’s say Mazur had gained popular notoriety from credulous journalists who found the implications of this number being prime to be exciting and newsworthy…

Reply ↓
Jordan Anaya on February 5, 2017 3:01 AM at 3:01 am said:

Brian Wansink has updated his Addendum II, making this the third addendum to his post:

“In the end, I think the biggest contribution of bringing this to attention (van der Zee, Anaya, and Brown 2017) will be in improving data collection, analysis and reporting procedures across many behavioral fields. With our Lab, a rapidly revolving set of researchers, lab and field studies, and ongoing analyses led us to be sloppier on the reporting of some studies (such as these) than we should have been. This past Thursday we met to start developing new standard operating procedures (SOPs) that tighten up field study data collection (e.g., registering on trials.gov), analysis (e.g., saving analysis scripts), reporting (e.g., specifying hypo testing vs. exploration), and data sharing (e.g., writing consent forms less absolutely). When we finish these new SOPs (and test them and revise them), I hope to publish them (along with implementation tips) as an editorial in a journal so that they can also help other research groups. Again, in the end, the lessons learned here should raise us all to a higher level of efficiency, transparency, and cooperation.”

I’m not sure what to think, but I do find it interesting that he’s convinced his lab will soon be proficient enough in these methods to write a publication about them!

One possibility is that he met with some statisticians and truly had a come to Jesus moment and will now be a model scientist from here on out.

But if this were the case he would realize most of the work he has published is likely wrong and he would warn people about trusting the results.

My suspicion is he realized the errors we found in his pizza publications are not limited to those papers, but also occur throughout his publications (he notes being “sloppier on the reporting of some studies”) and he is hoping if he finally acknowledges there is a problem people won’t do any more snooping. It’s too late, I’ve already snooped.

Reply ↓
- Martha (Smith) on February 5, 2017 3:48 PM at 3:48 pm said:
  
  In any event, this is a whole lot better than doing nothing — and hopefully will be a lesson to (at least some) other researchers to do better to begin with.
  
  Reply ↓
  - Andrew on February 5, 2017 4:01 PM at 4:01 pm said:
    
    Martha:
    
    Wansink seems to be following a strategy of getting ahead of the criticism while minimizing any acknowledgment that (1) many of his published findings could be pure noise and (2) his research methods are pretty much guaranteed to come up with meaningless yet statistically significant patterns, over and over again. For him to frame this problem as “sloppy reporting” is not quite right. Had those four studies been reported in a non-sloppy way, they still would be presentations of noise.
    
    The interesting decision point will come a couple years in the future. If Wansink’s lab members continue to design noisy studies, but now move to preregister all their research hypotheses, I think the stream of easy publications will dry up. It will no longer be possible for an eager student to walk in the door and squeeze out four papers from a failed study. This will really change everything. I doubt Wansink quite realizes this yet—I expect that he sees preregistration etc. as a bit of red tape that he can follow; he doesn’t catch that this has the potential to destroy the workflow which he has found so successful over the years. My guess is that when this finally hits home, he’ll try to find some way to continue with the p-hacking, perhaps by writing the preregistration plans vaguely enough, with enough researcher degrees of freedom, that they still will be able to find statistical significance from any experiment with enough effort. We’ll see.
    
    In any case, I agree with you (Martha) that this is a whole lot better than doing nothing.
    
    Reply ↓
- Carol on February 5, 2017 6:06 PM at 6:06 pm said:
  
  I met Brian Wansink years ago, when we were both associated with UIUC. I was even a subject in an experiment that he ran at a mini-conference there. (If memory serves, I ate too many M&Ms.) My impression then was that he was a creative, energetic, outgoing, and likable person, but not very detail-oriented — not the sort who’d spend the evening checking the accuracy of his stats, for instance.
  
  Reply ↓
- Carol on February 5, 2017 7:10 PM at 7:10 pm said:
  
  I note that Wansink has responded today (2/5/2017) on his website to van der Zee’s (2/1/2017) point-by-point reasons why the data should be anonymized and released. See the comments section.
  
  Reply ↓
  - Andrew on February 5, 2017 8:08 PM at 8:08 pm said:
    
    Carol:
    
    As I wrote in my post, I see no reason why Wansink should feel compelled to share whatever data he might have from that experiment. Also, given Wansink’s various very strange responses in that comment section, I would not take anything he posts there at face value.
    
    Reply ↓
    - Carol on February 6, 2017 10:16 AM at 10:16 am said:
      
      You don’t see any reason, Andrew, but van der Zee and Brown do!
    - Andrew on February 6, 2017 10:25 AM at 10:25 am said:
      
      Carol:
      
      Yes. Just as I think Wansink should feel free to say no to the data requests, I also think Zee et al. should feel free to keep requesting the data.
    - Jordan Anaya on February 6, 2017 10:30 AM at 10:30 am said:
      
      Carol:
      Someone else tried to get the data as well, they described their experience here:
      https://pubpeer.com/publications/92B836EDBA3F705300E46467F6E4F5#fb116690
    - Carol on February 6, 2017 10:49 AM at 10:49 am said:
      
      Jordan Anaya: Thanks. Very interesting. The Cornell Office of Research Integrity and Assurance has deferred to the journal. I’ve never seen this before.
      
      On a few occasions, when I have requested data or other materials from the author(s) of an article, and the authors refused, I’ve contacted the Office of Research Integrity (which goes by different names at different schools) and the office has seen to it that I have gotten the data or materials. I wonder what makes the difference here? The fact that the studies were not federally funded?
A bloke for Finland on February 5, 2017 10:50 AM at 10:50 am said:

This reminds me of a case in Finland. Some researchers had got their interpretation of odds ratio wrong. This prompted Pertti Töttö and his friends to write about it in a journal, and they also analyzed some other incorrect interpretations of OR. Some of the critiqued ones were really apprehensive and tried to brush off the critique by just practically talking shite that had nothing to do with the actual problem, but I remember that at least one fella, Lauri Nummenmaa, just admitted that he got it wrong and fixed the mistake in his book. I remember him because I used his book to study classical statistics, and I think it is “the” book for many other Finnish people too.
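One common way to get an odds ratio wrong is to read it as a risk ratio; whether that was the specific mistake in the Finnish case is not stated here, so treat the following as a generic sketch. The function names and the 2x2 counts are hypothetical, chosen only to show how far the two measures can diverge when the outcome is common.

```python
def risk_ratio(a, b, c, d):
    """Risk ratio for a 2x2 table.

    a, b: exposed with / without the outcome
    c, d: unexposed with / without the outcome
    """
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a, b, c, d):
    """Odds ratio for the same 2x2 table."""
    return (a * d) / (b * c)


# Hypothetical counts: 40 of 100 exposed and 20 of 100 unexposed have the outcome
rr = risk_ratio(40, 60, 20, 80)
or_ = odds_ratio(40, 60, 20, 80)
print(f"risk ratio = {rr:.2f}, odds ratio = {or_:.2f}")
# prints "risk ratio = 2.00, odds ratio = 2.67"
```

With these counts the risk doubles, yet the odds ratio is about 2.67, so describing the odds ratio as "2.67 times as likely" would overstate the effect. The two measures only come close to each other when the outcome is rare in both groups.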

A random fact: a few years ago there was a murder case in Finland that got lots of attention. At some point people were saying that the guy who got murdered (and his wife who was widely thought to be the culprit) was somehow entangled in satanism, because people had seen him borrow a book with a pentagram on the cover from the library. What it actually was was Pertti Töttö’s book “The return of the devilish positivism” (2000), which examines such problems as (if I remember correctly) how to put differences in numbers in context and are measures such as statistical significance enough to determine what is practically significant.

References:
I couldn’t find all the relevant texts, but here is at least the initial critical paper:
file:///C:/Users/Joni/Downloads/50456-1-43131-1-10-20150428.pdf

Reply ↓
- A bloke for Finland on February 5, 2017 11:08 AM at 11:08 am said:
  
  Ah crap, the link was to my hard drive, to which, I hope, you don’t have access. Well, if anyone is interested the title of the paper I was referring to was something like “Voiko Turkulaisten kirjoittamista artikkeleista yli 100% olla kvantitatiivisia?” (is it possible that over 100% of articles written by people from Turku are quantitative).
  
  Reply ↓
- Anders_H on February 6, 2017 5:28 AM at 5:28 am said:
  
  This is interesting, essentially the exact same thing happened to me last month. I wrote about it at https://andershuitfeldt.net/2017/01/25/odds-ratios-and-conditional-risk-ratios/
  
  Reply ↓
  - A bloke FROM* Finland on February 6, 2017 3:20 PM at 3:20 pm said:
    
    Hah, that’s weird indeed. But I think the thing is that “odds” as a concept is familiar only to the British and gamblers so it is really easy to screw up.
    
    *how in the fuck did I typo that two times in a row!
    
    Reply ↓
Jordan Anaya on February 6, 2017 12:44 AM at 12:44 am said:

I finished my blog post:
https://medium.com/@OmnesRes/the-donald-trump-of-food-research-49e2bc7daa41

It’s a little long, I couldn’t help but wax philosophical a bit. I hope you don’t mind me quoting you.

Reply ↓
eric robinson on February 7, 2017 7:22 AM at 7:22 am said:

I spotted something that I can’t get my head around and posted it to Brian’s blog. Maybe someone here can help too. Post below:

Brian,

I’m worried. This doesn’t add up –

Brian says ‘a non-coauthor Stats Pro is redoing the analyses’,

But Brian also said he couldn’t share the data because of his consent forms….

“The records of this study will be kept private. In any sort of report we make public we will not include any information that will make it possible to identify you. Research records will be kept in a locked file; only the researchers will have access to the records.”

If something along this line changes in the future, I will let you know.

So this non-coauthor Stats Pro isn’t one of the original researchers…. presumably he/she must have access to the data to redo the analyses. So… why not share whatever you share with the non-coauthor stats pro with Tim et al. and whoever else wants it?

Here is the solution. You can share the data.

Alternatively, your actions with the stats pro have invalidated your consent agreement with the participants in this study..

Which one is it? I’m confused about all of this.

Eric Robinson
University of Liverpool

Reply ↓
- Andrew on February 7, 2017 8:33 AM at 8:33 am said:
  
  Eric:
  
  It is possible that the researchers are not supposed to post the data publicly but can share it with individuals, one at a time, if they gain permission. Such rules exist. For example, I remember that to get access to state identifiers from the General Social Survey, we had to get individual permission and then the data could only be accessed in some sort of “clean room”; they would not give us a file.
  
  Beyond this, I can understand Wansink’s reluctance to share whatever data he has. Given the outrageous number of errors in his published papers, I’d guess that things would be even more embarrassing for him if those tables were compared to the raw data.
  
  By sharing the data only with someone who he is contacting personally, Wansink can attempt to control the damage to his reputation by telling that statistician his side of the story and giving that statistician an incentive to match that story as closely as possible to the data. The mission, I assume, is for the statistician to eventually report that there were issues in record-keeping but once the data were analyzed correctly, none of the substantive conclusions changed, thus the papers of Wansink and his colleagues still stand. The model might be the statistician hired by Gilbert et al. in the notorious “the replication rate in psychology is quite high—indeed, it is statistically indistinguishable from 100%” episode.
  
  Given everything we’ve heard so far, I have little expectation that I would trust a report written by someone chosen and hired by Wansink, and I doubt this “Stats Pro” strategy will advance research. The very term “Stats Pro” is a bit of a joke and sounds more like public relations than science.
  
  If Wansink really did want to figure this out, he’d share lots of his data with some outside group—not chosen by, paid by, or associated with Wansink, or Cornell, or his close colleagues—and let them see what’s going on. But, again, at this point his incentives would seem to point him toward minimizing the release of any more information.
  
  From my perspective, the problem is that forking paths invalidates all the claims in those four papers, even if there hadn’t been 150+ errors. This was the “xkcd jelly bean” conversation that Wansink non-responded to on his blog. So even if the Stats Pro recalculates all the t-statistics etc and the significance levels don’t change much, the study is still dead on arrival.
  
  The only thing that could really work at this point, I think, would be a series of preregistered replications. But I really really really don’t think anyone should bother wasting their time on this, unless they truly have nothing better to do.
  
  Reply ↓
BoSelecta on February 7, 2017 3:05 PM at 3:05 pm said:

Am I the only one struck by how perfect Wansink’s responses are? I mean, I find the whole situation infuriating and I can’t wrap my head around how and why stuff like this happens, but at the same time the wording of his responses is so perfect… It’s as if a perfect response-generating AI was behind them.

Reply ↓
Carol on February 7, 2017 3:58 PM at 3:58 pm said:

BoSelecta: Along that line, I wonder why there has been dead silence from Wansink’s co-authors? Perhaps one would not expect a response from a student visiting from Turkey. But two of the pizzagate articles were first-authored by David R. Just, who is a full professor at Cornell.

Reply ↓
Jordan Anaya on February 7, 2017 9:21 PM at 9:21 pm said:

We received a response to our email to the Office of Research Integrity and Assurance. Tim van der Zee has shared our email and their response on his blog: http://www.timvanderzee.com/case-study-on-suspected-research-misconduct/

I also hope you guys got a chance to read my post: https://medium.com/@OmnesRes/the-donald-trump-of-food-research-49e2bc7daa41

Reply ↓
- Andrew on February 7, 2017 9:41 PM at 9:41 pm said:
  
  Jordan:
  
  Interesting story, and I can feel your frustration. But my response to Tim is the same as what I wrote earlier on this blog: I see no reason why Wansink should feel compelled to share whatever data he has from that experiment. I also see no reason why any of us should believe anything Wansink writes about it.
  
  These are data from a study which Wansink himself described as “flawed,” and the only research products from these studies are four hopelessly p-hacked studies which would be essentially useless (except, I suppose, for teaching purposes) even had it not been discovered that they had over 150 errors. There’s nothing there. Release of the data (or a discovery that there actually is no dataset) would, presumably, reveal even more errors.
  
  Reply ↓
  - Jordan Anaya on February 7, 2017 9:51 PM at 9:51 pm said:
    
    We suspect that there is indeed a data set because there is a video showing them at the restaurant:
    https://www.youtube.com/watch?v=9OzunhdW2Qk
    
    The date of the video didn’t seem to correspond to the timeline provided by Wansink, so we actually called the restaurant, and they told us Cornell had made multiple visits. I think the lab also attempted to run even more studies at the restaurant–I guess they really like doing studies there for some reason.
    
    I agree with everything you are saying, but Tim, Nick, and our other confidantes think we should continue to do our due diligence. If anything, it doesn’t look good for them to deny sharing the data.
    
    Another reason we believe the data exists is that they were perfectly happy to share it with us until we mentioned we wanted to use it to confirm some errors we had found.
    
    It looks like New York Magazine is going to publish a long post about the story tomorrow, so keep an eye out for that.
    
    Reply ↓
    - Andrew on February 7, 2017 10:05 PM at 10:05 pm said:
      
      Jordan:
      
      It makes sense that if they’d done successful (by their standards) experiments at this restaurant before, that they’d want to do more. It can be hard to set up good working relationships, so once you have one, you might as well keep using it.
      
      Also, it’s possible that there’s no single dataset. There could be lots of data scrawled on all sorts of forms, maybe they weren’t entered correctly into the computer, who knows? It does seem to be a mystery how they could’ve had 150 errors. So I can understand your desire to get to the bottom of things. It’s a puzzle, kind of the opposite of the Michael Lacour story. Lacour had an entire dataset that was consistent with his published paper, but the data were faked. In contrast, Wansink and his colleagues seem to have actually done their experiment, but it’s mathematically impossible for there to be a dataset that’s consistent with what they published.
    - Jack on February 8, 2017 2:16 AM at 2:16 am said:
      
      It doesn’t look good for you to do what you’re doing. I read your post and it’s just silly, if those are the errors you found out, please, this is not worth anybody’s time… and you clearly show a personal agenda because of your story. There’s no way you are being professional about this, your silly post title speaks by itself.
    - Andrew on February 8, 2017 6:45 AM at 6:45 am said:
      
      Jack:
      
      The whole Wansink thing’s a waste of time all around, yet we’ve all collectively spent many many hours reading those papers, staring at the numbers, and writing about it. In that sense it makes none of us look good! As scientists we should be out there making new discoveries, right? Or at least developing tools to allow others to learn about the world or improve their lives. Or, failing that, we should be out in the world enjoying ourselves, taking our families to delicious meals at Taco Bell or saving up our money for that dream bullfight-centered vacation. But instead here I am responding to a blog comment! My only justification is that the careful study of individual examples of junk science can, we hope, help us better understand the larger issues of science research and science communication.
      
      So, yes, I agree that Jordan’s post is over-the-top in a bit of an embarrassing way, and that his post title is silly—but maybe the reason this is embarrassing to you and to me is that here we are spending time commenting on these threads.
      
      Also it’s my impression that Wansink has had influence on policy—his recommendations are widely publicized in the news media, and he had a government appointment a few years ago—so purely from the standpoint of the public good on food/nutrition research and policy, Jordan could be doing a service. Yes, his post is a bit personal and his title is silly, but that’s part of the whole package, as the personal interest is a big part of what motivated him to dig into this case in the first place.
      
      Hey—I just spent 10 minutes writing this comment. Thus demonstrating my point. In this case, though, don’t take it as a waste of time on my part but as a desperate attempt to avoid doing my real work.
    - Jack on February 8, 2017 11:57 AM at 11:57 am said:
      
      I agree with you, I actually hope Jordan keeps doing what he’s doing, but please Jordan, keep it professional, do not frame it as a personal vendetta.
    - Jack on February 8, 2017 12:13 PM at 12:13 pm said:
      
      You know what I take back what I said… the situation is getting so messed up that there’s reason to think that the polite approach I’m suggesting is better. This last Fiske post was the last straw, this is literally a war against bad science.
    - John on February 8, 2017 3:40 PM at 3:40 pm said:
      
      “that there’s reason to think”
      
      You probably meant “there’s NO reason to think”.
    - Jack on February 8, 2017 3:42 PM at 3:42 pm said:
      
      Yes you’re correct I meant there’s no reason to think.
    - Jordan Anaya on February 8, 2017 4:46 PM at 4:46 pm said:
      
      A blog post was just posted by someone who had a previous run in with the Wansink Lab:
      http://persuasivemark.blogspot.be/2017/02/science-first-communication-second.html
    - John on February 8, 2017 5:06 PM at 5:06 pm said:
      
      Keep up the good work Jordan!
    - Jordan Anaya on February 10, 2017 4:40 AM at 4:40 am said:
      
      Andrew and Jack:
      
      In writing you have to know your audience, and I know my audience very well.
    - Carol on February 8, 2017 10:38 AM at 10:38 am said:
      
      Jesse Singal’s article “A popular diet-science lab has been publishing really shoddy research” is up now on New York Magazine’s website: nymag.com/scienceofus
    - Carol on February 8, 2017 4:46 PM at 4:46 pm said:
      
      And now see this: persuasivemark.blogspot.be/2017/02/science-first-communication-second.html
      
      for another person’s experience with the Wansink lab.
  - Nick on February 8, 2017 6:00 PM at 6:00 pm said:
    
    Andrew, you wrote:
    
    >>I see no reason why Wansink should feel compelled to share whatever data he has from that experiment.
    >>I also see no reason why any of us should believe anything Wansink writes about it.
    That’s great at the level of the skeptical scientist (which is, by coincidence I’m sure, the title of Tim van der Zee’s blog). The problem is that 99.99% of people out there typically believe what (the media tells them that) “scientists have discovered” — as long, of course, as it fits into their personal prejudices and doesn’t seem to involve them having to question the merits of their last $1000+ purchase.
    
    This means that in practice, society has a problem with junk science, whatever lab it comes from and on whatever topic it focuses, because the 0.01% of people who read and question academic journal articles don’t get a look-in when it comes to national TV coverage.
    
    I think it’s correct to say that historically, monks and nuns got a free pass out of some of the things that society expected people to do because they were sacrificing a bunch of other stuff in order to live their monastic lives. Perhaps we need to consider ways to stop scientists double-dipping (i.e., mostly-government funding on the one hand, and personal fame and fortune via mass-market books and other products and services on the other). I am constantly amazed that psychologists working on Effect X can publish books entitled “Effect X: Ten Weird Tricks To Happiness” without having to declare their obvious massive disincentive to disprove Effect X (cf. Feynman) as a conflict of interest.
    
    Reply ↓

Statistical Modeling, Causal Inference, and Social Science
