Fake data on the honeybee waggle dance, followed by the inevitable “It is important to note that the conclusions of our studies remain firm and sound.”

Posted on November 7, 2024 9:26 AM by Andrew

I hadn’t thought about bee dancing for a long time, when someone pointed me to this post by Laura Luebbert and Lior Pachter on a bit of data fraud in biology. Luebbert writes:

Four years ago, during the first year of my PhD . . . I was assigned two classic papers on the honeybee waggle dance: “Visually Mediated Odometry in Honeybees” (Srinivasan et al., JEB 1997) and “Honeybee Navigation: Nature and Calibration of the ‘Odometer’” (Srinivasan et al., Science 2000). Since I was not familiar with honeybee behavior, I decided to expand my literature review to other papers on the topic, including “Honeybee Navigation En Route to the Goal: Visual Flight Control and Odometry” (Srinivasan et al., JEB 1996) and “How honeybees make grazing landings on flat surfaces” (Srinivasan et al., Biological Cybernetics 2000). While reading these papers, I sensed something strange; I had the feeling that I was looking at the same data over and over again.

It turns out that she was seeing the same data over and over again, a situation that reminded me of the story of economist Bruno Frey, who published something close to the exact same paper five times (motivating our update of Arrow’s Theorem) in five different journals.

This bee-dance thing was worse, though, because the identical data were claimed to be coming from different experiments!

Luebbert continues:

I was deeply concerned by these findings and presented them at the journal club meeting using animations and overlays to show that the data was indeed identical . . . I had imagined that the response to my presentation would be concern and advice on how to report my findings. Instead, both within and outside of Caltech, the response amounted to little more than a collective shrug.

This upsets me but does not surprise me. After all, my own institution, Columbia University, which houses so much wonderful teaching and research, also never did anything about their cheating on the U.S. News ranking or their professor of surgery whose papers were flagged for suspect data. So, yeah, I guess no surprise that the bee-dancing subfield of biology is no better than my employer in this respect.

Luebbert and Pachter put the details in this Arxiv paper, “The miscalibration of the honeybee odometer.” Amusingly—or, I should say, horrifyingly—enough, their article was rejected by a different preprint server, bioRxiv, which told them that they should “reformat it as a research paper presenting new results along with appropriate methods used, rather than simply a critique of existing literature.” Also, Pachter writes: “we were told that our manuscript contained ‘content with ad hominem attacks,’ even though it was merely a factual report of the issues we observed with appropriate citations of the affected papers, with no attack on any people or specific persons.”

The use of the term “ad hominem” in scientific discussions

“Ad hominem,” like “disingenuous,” is an expression that people use when they have nothing to say. It’s an attack in the guise of a defense, a pseudo-sophisticated phrase that, in effect, is roughly equivalent to: “What you say is true, and I don’t want to face up to it, so I’m trying to sidetrack this open scientific discussion by making it personal.”

I looked at Luebbert and Pachter’s paper, and it indeed contains nothing even close to an ad hominem attack (from Merriam Webster, “appealing to feelings or prejudices rather than intellect” or “marked by or being an attack on an opponent’s character rather than by an answer to the contentions made”). This is really disgraceful on the part of bioRxiv. I’d be interested in seeing the complete text of the message they sent to Luebbert and Pachter.

So, yeah, the usual story. Bad science, nobody cares. Indeed, worse than not caring, it’s an active anti-caring in the form of active efforts to suppress criticism. When there’s a problem, the reaction is to shoot the messenger.

It gets worse

Mandyam Srinivasan, the bee-dance researcher who had the duplicate data in his papers, made the mistake of responding (here and here) to Luebbert and Pachter’s post, and the response is a doozy.

Srinivasan characterizes the data issues as “typographical errors and minor oversights,” which is ridiculous if you actually look at the specifics of all the problems in those papers.

He also writes:

It is important to note that the conclusions of our studies remain firm and sound, and have been replicated independently in many subsequent studies from other reputable laboratories.

It’s funny how the conclusions never seem to be affected by revelations of major data problems. It kinda makes you wonder why these researchers bother gathering and analyzing data at all, given that their errors never affect their conclusions. This is consistent with the apparent attitude of some scientists that they already know the truth, with all this experimentation, analysis, and writing-up of results being a tiresome bit of paperwork between the theory and the professional acclaim.

The most annoying thing

I was particularly annoyed by this remark from Srinivasan:

I am surprised (and disappointed) with the unprofessional manner in which the authors of the arXiv document (Luebbert and Pachter) have conducted their commentary of some work in my laboratory. The authors never contacted me personally with their queries, nor did they seek clarification.

Grrrrr. Srinivasan published some papers. When you publish a paper, it is public. It’s out there for anyone to read, and for anyone to criticize. Correctness of a published paper is the author’s responsibility. There’s nothing wrong with contacting authors with your queries if you find problems with their paper, but also no requirement. Similarly, if you read a paper and want to use its methods, there’s nothing wrong with contacting the authors with your queries, but there is no requirement to do so. If you only want your paper read by people who’ve contacted you personally, don’t publish it. And if you don’t want anyone to criticize it, don’t publish it.

Personally, I like when people find mistakes in my published work. I make mistakes all the time (see here and here)! If the people who find mistakes in my work, or who think they find mistakes, contact me personally, that’s great, but the most important thing is that the mistakes (or possibly the points of confusion) get out there. And then I can go back and issue corrections to my errors (for example here, here, here, and here), improve my work (for example here), and clarify points of confusion (for example here).

Hey, Fermat!

As part of his reply to Srinivasan’s comments, Pachter wrote:

Finally if there are many reputable laboratories that have independently replicated the results in the 10 papers we flagged, it would be great if you could post links to all their papers and relevant figures in a reply to this comment.

To which Srinivasan responded:

A full, detailed, point-by-point response to the Luebbert and Pachter document will be available soon.

In Srinivasan’s last reply, he did not seem to be able to find space in the comment window for any references to the claimed many subsequent independent replications from other [sic] reputable laboratories.

Let’s look at the replications.

I wrote the above post in July and scheduled it for November. I guess that Srinivasan’s full, detailed, point-by-point response should be available by then, so we can see what he said.

Actually, I’m guessing that biologists would be less interested in the point-by-point response and more in the claim that their conclusions “have been replicated independently in many subsequent studies from other reputable laboratories.” Talk is cheap; replications are real. And it’s my impression that in biology, results of interest really do get replicated.

It could well be that Srinivasan’s findings are real, and really have been replicated. Just cos there’s research misconduct, it doesn’t mean you’re not doing real science. Indeed, this is kinda the flip side of the “honesty and transparency are not enough” principle. Scientific research can be honest, transparent, and wrong (this happens when you study weak theory with noisy measurements); I guess it can also be error-ridden in its details while still being on the right track.

So I’ll be interested to see what Luebbert, Pachter, and researchers in the bee-dance subfield think after looking at the references that Srinivasan supplies to the many replications from reputable laboratories.

P.S. The authors did post their promised reply.

Here it is.

Here is Luebbert and Pachter’s response to that response.

These links are also in this Pubpeer thread.

You can read these and draw your own conclusions.

27 thoughts on “Fake data on the honeybee waggle dance, followed by the inevitable “It is important to note that the conclusions of our studies remain firm and sound.””

Raghu Parthasarathy on November 7, 2024 10:09 AM at 10:09 am said:

I fully agree. I remember this story from when it came out in X posts, and being impressed especially by the grad student (Luebbert). I didn’t remember bioRxiv’s role in this and I’m disappointed by their weasely “policy.” More broadly, in all cases like this, I continue to think that a major problem is the funding agencies, who seem not to care at all about maintaining scientific integrity and investigating misconduct. The silence from NSF, NIH, etc., is more depressing than the laughable defense by Srinivasan.

Adede on November 7, 2024 11:22 AM at 11:22 am said:

I take issue with the “ad hominem” section. While it’s certainly possible that some people *mis*use the term ad hominem if they don’t have anything substantial to say, the term ad hominem has a legitimate use.

You draw an equivalence between the term “ad hominem” and the statement “What you say is true, and I don’t want to face up to it, so I’m trying to sidetrack this open scientific discussion by making it personal.” But to me, “ad hominem” *describes* what has happened when someone is trying to sidetrack things by making it personal. Insults like “methodological terrorist” and “data Stasi” are statements that I would describe as ad hominems.

As you point out, the term “ad hominem” was being used in a way inconsistent with its dictionary definition. So, maybe blame Srinivasan or bioRxiv or whoever, instead of the term itself.

(They write “we were told that our manuscript contained ‘content with ad hominem attacks'”, but due to the use of passive voice the teller is unclear. It’s in the paragraph about bioRxiv, so it would be a reasonable inference that bioRxiv told them that, but we can’t rule out that some other party told them they were using ad hominem attacks. Still, my point remains: blame the misuser of the term (whoever that may be) rather than the term itself.)

- Andrew on November 7, 2024 1:27 PM at 1:27 pm said:
  
  Adede:
  
  Fair enough. There is such a thing as an ad hominem response, just as there really are disingenuous responses. It’s just my impression that when these terms are used, they’re typically used inaccurately, as a way to attempt to delegitimize real responses. I think the accessibility of the term “ad hominem” allows it to be used in this distracting or defensive way.
  
- Andy W on November 7, 2024 1:49 PM at 1:49 pm said:
  
  I actually think it goes both ways, https://andrewpwheeler.com/2021/05/18/academia-and-the-culture-of-critiquing/. So people sometimes think legit critique is ad hominem, and that actual ad hominem attacks (such as calling people racist) are legitimate.
  
Jessica Hullman on November 7, 2024 11:38 AM at 11:38 am said:

>When you publish a paper, it is public. It’s out there for anyone to read, and for anyone to criticize. Correctness of a published paper is the author’s responsibility. There’s nothing wrong with contacting authors with your queries if you find problems with their paper, but also no requirement. Similarly, if you read a paper and want to use its methods, there’s nothing wrong with contacting the authors with your queries, but there is no requirement to do so. If you only want your paper read by people who’ve contacted you personally, don’t publish it. And if you don’t want anyone to criticize it, don’t publish it.

Can’t be said enough.

Anoneuoid on November 7, 2024 12:04 PM at 12:04 pm said:

Talk is cheap; replications are real. And it’s my impression that in biology, results of interest really do get replicated.

Unfortunately this impression is based on hearsay, and the evidence will stop there. When actually investigated in a more formal way, the replication rate is ~ 20-30%. This was mentioned in the 2022 thread:

https://statmodeling.stat.columbia.edu/2022/03/04/biology-as-a-cumulative-science-and-the-relevance-of-this-idea-to-replication/#comment-2046872

Here is how it works:

Study A is done with, eg, male rats. It is very exceptional to get funding to repeat this same experiment, with male rats. However, a “replication” study B can be funded if it is using female rats. If the results are the same, it is a successful replication. If the results are different it is because male and female rats are different. Replace male/female with age, dosage, some difference in technique, etc. That should give the general idea.

That is where this impression comes from and why it differs from what is observed by replication projects.

Jay Patel on November 7, 2024 1:23 PM at 1:23 pm said:

Given that Srinivasan won major awards in his field, a good exercise for budding sleuths would be to analyze each awardee’s work for basic errors and signs of fraud. A larger-scale evaluation may push academia to change.

The current issue is that these cases pop up in isolated, small-scale ways that enable researchers to dismiss them. I’m not talking about Srinivasan here, but his peers and academia more broadly.

- Andrew on November 7, 2024 1:25 PM at 1:25 pm said:
  
  Jay:
  
  I’m wondering about Srinivasan’s promised “full, detailed, point-by-point response.” Did he ever post that response?
  
  - Daniel Lakeland on November 7, 2024 2:21 PM at 2:21 pm said:
    
    Me too, this is the thing I was hoping our audience knew and would post this morning.
    
    - Andrew on November 7, 2024 6:31 PM at 6:31 pm said:
      
      I posted the links in the P.S. to the above post.
  - Joe on November 7, 2024 3:02 PM at 3:02 pm said:
    
    Is it here?
    
    https://arxiv.org/pdf/2408.11520
    
    and the response to that response is here (with additional links)
    
    https://github.com/pachterlab/LP_2024/blob/main/response_to_rebuttal.pdf
    
  - Joe on November 7, 2024 3:08 PM at 3:08 pm said:
    
    I posted some links that I think are to the rebuttal but they are held up in moderation. If you follow the links to Srinivasan’s comments Andrew linked to, the last post has links to a GitHub repo that contains Luebbert and Pachter’s response to the rebuttal, and the rebuttal has links to Srinivasan’s comments.
    
    - Bob76 on November 7, 2024 3:48 PM at 3:48 pm said:
      
      I followed those links and found the document and Srinivasan’s comments.
      
      Those comments are weird. The figure here (https://www.semanticscholar.org/paper/The-miscalibration-of-the-honeybee-odometer-Luebbert-Pachter/9daf1f8c59928c5cfbb6c1e46ad86b821ad88b19/figure/4) overlays two plots. One has been rotated; after that rotation, they seem to line up essentially perfectly.
      
      Responding to this unusual isomorphism, Srinivasan’s response is:
      This claim is outlandish. For one thing, 8A and 9B show different relationships: 8A plots Height versus Frame number, while 9B plots Horizontal distance travelled versus Frame number. The vertical axes represent completely different variables. Why would anyone even contemplate flipping the data points tracking a variable in one curve upside down to try to match the data points in another curve that is tracking a different variable? We have certainly not manipulated or confused the data in these curves, and it is difficult to imagine why anyone would even try to manipulate data in this strange way.
      
      It seems to me that any reasonable author presented with the same evidence would respond, “Holy Cow! I cannot believe what I see. I did the work straight but that match seems highly unlikely under any reasonable assumptions! If I ever figure out any explanation for this match, I will let you know.”
Alfred on November 7, 2024 6:19 PM at 6:19 pm said:

I am not going to argue about the term “ad-hominem.” But the one time I discovered major flaws in papers published in top journals in my field, the author responded in several venues that my research was motivated by “personal malice.”

Which raises an interesting question. Do we only care about motivations when we can say they are bad?

- Andrew on November 7, 2024 6:25 PM at 6:25 pm said:
  
  Alfred:
  
  Beyond that, it should be about the science. Even if criticisms are motivated by personal malice (which, in the case of the above post, they do not seem to be), they still can be evaluated on their own terms. Recall the Javert paradox.
  
Woodpecker on November 8, 2024 12:27 AM at 12:27 am said:

Okay. I have been following this since it started.

I worked with honey bees a few years back (not my project, but I helped collect data). My previous lab’s work was built on Srinivasan’s work (similar research field).

L&P raised good points in their paper, and Srinivasan reacted poorly to the allegations. There are mistakes in the original papers, as one can see, which he calls “typographical errors and minor oversights”. So, three main issues:

1) Wrong numbers and rounding errors (duration calculation from frame rates?). I think someone who collected the data messed up the data columns while analyzing. That’s possible when looking at hours of waggle dances and back-calculating from videos.
2) Same graphs in multiple papers. This is due to re-using the same control data in many papers and not describing the methods & variables thoroughly.
3) Calculation of R^2 from regression. Srinivasan said in his rebuttal that in some places it’s R instead of R^2. In other places, L&P didn’t account for things that were not mentioned in the original paper (given in the rebuttal). I didn’t have time to do the simulations myself, so I can’t comment on that.
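
On the third issue, the gap between R and R^2 is easy to check numerically. The sketch below is only an illustration: the data are made up (not values from any of the papers), and `pearson_r` is a hand-written helper, not anything used by Srinivasan or by L&P. It shows that for a simple linear regression R^2 is the square of Pearson's r, so a value reported as 0.99 describes a somewhat weaker fit if it is really r than if it is R^2.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data for illustration only: distance flown vs. mean waggle duration.
distance = [50, 100, 150, 200, 250, 300]
duration = [190, 240, 330, 410, 460, 560]

r = pearson_r(distance, duration)
r2 = r ** 2  # for a one-predictor linear regression, R^2 equals r squared

print(f"r = {r:.4f}, R^2 = {r2:.4f}")
print(f"if a reported 0.99 is actually r, the implied R^2 is {0.99 ** 2:.4f}")
```

Because r lies between -1 and 1, squaring it always moves it toward zero, which is why mixing up the two makes a fit look slightly tighter than it is.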

Overall my take: there are many errors (typographical, or minor and major oversights, or whatever you call them) and lots of incomplete descriptions. I personally do not think Srinivasan or his students purposefully manipulated the data (I may be too optimistic, though), but they used incorrect scientific principles (I don’t know if that’s the right word). I may be missing/misremembering something, but I can respond if anyone has questions.

And yes similar works have been replicated many times

- Daniel Lakeland on November 8, 2024 7:13 AM at 7:13 am said:
  
  If we are talking about using the wrong columns, is this an analysis that was done in Excel? If so, further evidence that it’s a major mistake to use Excel. In theory, should be ok, but in practice from what I’ve seen people far too often make mistakes selecting columns by hand or typing formulas with the wrong columns or rows etc
  
  - Dale Lehman on November 8, 2024 8:23 AM at 8:23 am said:
    
    Spreadsheets don’t kill studies, people do. Well, spreadsheets can and do contribute to all sorts of errors. I teach spreadsheet modeling and am always amazed at the variety of different errors that students make – a constant surprise as yet a new error shows up. If we forced them to use a more appropriate tool, then most of my students just wouldn’t do any analysis. Perhaps that is for the best, but it isn’t my view. They must learn a lot of critical thinking in order to correct themselves, and I believe that learning helps train them to think more clearly and critically in more than just spreadsheet use.
    
    I don’t think the “major mistake” they make is to use Excel. It is more fundamental than that. Most people write poorly (myself included). Is it because I use MS Word? ChatGPT? Or is it because I am missing some important thought processes and the training to use these appropriately?
    
    - Daniel Lakeland on November 8, 2024 8:50 AM at 8:50 am said:
      
      My concern is that spreadsheets do stuff that makes it easier to make mistakes. Some examples…
      
      Everything is a grid, so if you’re trying to do computations on cells it’s easy to click the wrong row or column, perhaps an adjacent one.
      
      Everything is labeled as letters and numbers. Did you mean to compute with B12 or N12 but N is next to B on the keyboard?
      
      Data types are constantly converted under the hood: were you working with a gene named Mar5 or the date 2024/03/05?
      
      Semantically meaningful stuff is often encoded in meaningless aesthetics. Yes, all the red cells are the ones in which you were unsure of the readout of the measurement instrument, but, that’s not easily actionable.
      
      There are sophisticated lookup functions which are a trap, you might think it’s a good idea to lookup the value in column C which is in the row where column A contains the value equal to the current value of the computed formula in the named cell poweroutput but this I promise you is a terrible idea and will likely ensure you have a bug that provides a perfectly plausible and wrong floating point answer.
      
      I’ve just seen a lot of Excel sheets written by Engineering graduates who have passed courses in Matlab, Calculus, mechanics of materials, PDEs and such and they always are a mess. There really isn’t a way to do the kind of sophisticated calculations these students and professionals want to do in a “safe” way in Excel. Eventually they often debug it enough to make it ok, but this just ensures it can never be allowed to be modified in any way. One person said they had a full year period to get a single cell added to an Excel sheet at a finance company because they had to get it through a regulatory body. Realistically that might be necessary because to validate the sheet requires Herculean effort. How do you know some deep calculation cell wasn’t referencing the value in that new cell? Relying on it to be blank and hence “equal to zero”?
      
      If you’re talking about people for whom the alternative is no analysis at all, ok, maybe but I’m talking about people who’ve had multiple semesters of highly technical topics and Excel is an enormous foot gun. Jupyter notebooks running R or Julia or Python are vastly preferred and not at all beyond the capabilities of these people.
      
      To stick with your firearm analogy, Excel is like a shotgun with a weak barrel that every so often explodes in your face. It really does kill people (eg when it’s used to calculate the sizing of bolts on hanging walkways over atriums and has opaque bugs that lead to undersized bolts or whatever)
  - Woodpecker on November 8, 2024 12:29 PM at 12:29 pm said:
    
    Yeah. Talking about both. More about putting data into columns from observations and calculating with formulas using the frame rate. These are observations done by people (interns, grad students) looking at bees doing the waggle dance (at least in some of these papers, if not all), writing them down in a notebook, and then transferring them to an Excel sheet.
    
- Anonymous on November 12, 2024 11:36 AM at 11:36 am said:
  
  I’ve also been following this since it blew up, as I have close links to the field. As a long time reader of this blog who greatly enjoys reading about the debunking of dodgy data, I looked into it quite closely.
  
  Yes, there are mistakes in these papers, some of which (e.g. mislabeled axes, repetition of a figure instead of the correct figure, describing the conditions of collection of control data slightly differently when it is actually the same control data as a previous paper) seem so obvious that it seems rather sloppy of the authors (and, one could argue, subsequent readers!) not to have noticed over the many years in which they have been widely cited. In general, these “do not affect the conclusions” in the sense that the figure mistakes are not propagated through to conclusions drawn in the text (e.g. in the case of the repeated figure, the text actually describes differences between the figures, so clearly there was no intent to have the same figure repeated). On the other hand, being sloppy with some of your data is, reasonably, going to undermine people’s trust in the rest of your data. On the third hand, as Andrew has noted before, we all make mistakes and the best thing to do is to correct them, which does seem to be what has happened here.
  
  There are other parts of the L&P critique that seem to be disagreement over how a calculation should be done (e.g. for calibrating the dance odometer or regression calculations). It is perfectly reasonable to make such critique, but I think disingenuous of L&P to ‘sandwich’ the more serious allegation of data manipulation between these critiques, so as to create an impression of cumulative problems, when arguably these are neither errors nor misrepresentations. I think the authors’ responses show that they had some justification for how they did the calculations and have at least been transparent on how they were done.
  
  With regard to replication, I think this is exactly what L&P’s own figure 2 illustrates: multiple groups have shown (like Srinivasan) that there is a consistent relation between distance and waggle run duration. Estimates of the parameters (slope and intercept) differ (subsequent studies have shown these parameters can vary between bee colonies, individuals, subspecies, weather conditions etc.). Srinivasan’s estimates are not so discrepant as to be suspicious – the error here seems to be rather one of having made unjustified claims of precision (“17.7 degrees of image motion per millisecond of waggle dance”).
  
  This leaves only one part of the L&P document (fig 4) which has any relevance to the issue of “fake data” as used in the title of this post. I would expect for such an accusation the analysis should be thorough and convincing, such as the take-downs by Data Colada of the Ariely and Gino datasets. However, I couldn’t even fully understand exactly what the accusation was – that several plots of the relationship of two variables were suspiciously similar in shape, with closely matched (but not entirely overlapping) data points? Note that their 4B, which seems to show complete overlap, is actually an example of the same data (never claimed to be otherwise) which had mislabeled axes in one publication. For 4A, there is not complete overlap, and given these plots were presented as part of an argument that a specific causal process underlies the specific form of the observed relationship of the variables, then the similarity is not unexpected. Calling this “seemingly manipulated” (leading those outside the field to use terms such as “fake data”) does not seem justified.
  
  - Anonymous on November 17, 2024 3:04 PM at 3:04 pm said:
    
    What are your thoughts on the first point described in the rebuttal (https://github.com/pachterlab/LP_2024/blob/main/response_to_rebuttal.pdf)?
    
    - Andrew on November 17, 2024 3:14 PM at 3:14 pm said:
      
      Anon:
      
      I’m curious about the very first point in L&P’s rebuttal, where they write:
      
      In Srinivasan and Tautz’s response, they allege that while the mean and SEM values are similar, they are not identical. According to them, the values appear identical in the published figure (left panel above) because “the resolution of the image published by PNAS is inadequate.”
      
      Moreover, they provide “a copy of the figure submitted to PNAS” (right panel above). Upon close inspection, it immediately becomes clear that the figures are not the same. The font size is larger in the figure published by PNAS (e.g. see the space between “T . test 3” and the edge of the bar), and 2-3 dashes between the bars become 5 dashes. These changes are not caused by a change in resolution, and unless PNAS decided to change the space between dashes in the figure, it raises the question of where this new figure came from. Especially because, according to Srinivasan and Tautz themselves, “the relevant datasets for the figure are no longer available.”
      
      The bit about the relevant datasets no longer being available, I agree that could make us suspicious. But the bit about the submitted and published figures being different . . . I don’t know what PNAS did in this case, but it’s not unheard of for fancy journals to reformat tables and figures, and to introduce errors when doing so.
    - Anonymous on November 19, 2024 4:57 AM at 4:57 am said:
      
      I agree with Andrew that it is not implausible that the production process (particularly 20 years ago) could result in some differences between a submitted and a published figure. At least, I don’t think this difference should be assumed to be duplicitous.
    - Woodpecker on November 20, 2024 5:01 PM at 5:01 pm said:
      
      I agree with Andrew and anon.
      
      If we assume that that’s not the case, it does become suspicious. The graph was not carefully made. If you see the paper in PNAS, it has a bunch of corrections, along with the y-axis variable.
Geoff Stuart on December 16, 2024 6:34 PM at 6:34 pm said:

The title of this blog post starts with the statement “Fake data on the honeybee waggle dance…”. I have written a detailed statistical critique [1] of the original arXiv article by Laura Luebbert and Lior Pachter on which this statement is based and have posted a link to it on PubPeer [2]. Luebbert and Pachter’s response [3] to my critique shows that they did not understand the basic statistical errors that compromised their simulations. They claimed that in my simulations I assumed that bees shared the same mean waggle duration. I certainly did not; the standard deviations reported by Srinivasan et al. [4] contain variation in mean waggle durations between bees and between dances and cannot be decomposed. They also claimed that I used the observed means reported by Srinivasan et al. as the starting point of my simulations. I did not, except to reproduce Luebbert and Pachter’s erroneous results. As explained in my detailed rebuttal, I used points on the line of best fit to simulate a linear regression model, as appropriate.

Although some papers from Srinivasan’s group contain errors, since corrected, Luebbert and Pachter’s own errors undermine their claim to have found evidence of more serious scientific misconduct.

I refer interested readers to the critique, but briefly, Science [5] quotes Lior Pachter as follows:

‘Six of the 10 papers have R-squared values—a value between zero and one that shows how well data fit a predicted trend—of about 0.99. These are “ridiculously high” correlations, Pachter says, and unlikely given the data were collected across a wide range of conditions’.

And further:

‘it’s “extremely unlikely” Srinivasan obtained the raw data that could have led to those outcomes in the first place’.

Pachter therefore implies that there is a high likelihood of scientific fraud or “fake data”. As I show in my critique [1], Luebbert and Pachter made elementary statistical errors that undermine their case. First, they confused the standard error of the mean of a set of observations with the standard deviation, as evidenced by this statement:

‘In examining the papers above, we noticed repeated presentations of regressions with R^2 ≥ 0.99 (Table 2). These extremely high correlations suggest that across studies and experiments, there was very little technical and biological noise’.

This is a fundamental misunderstanding. As given in Table 1 of Srinivasan et al., the standard deviations of waggle durations show substantial variation (third column of that table). Each data point used in the regression model is a mean averaged over a large number of waggle durations, sometimes in the hundreds. Such means estimate the underlying population values very accurately, as reflected in their small standard errors. However, in the simulation where Luebbert and Pachter claimed that only 0.6% of the simulation runs yielded an R^2 value greater than the value calculated from Table 1 of Srinivasan et al. (0.997), they did not average over the reported number of waggle durations. Instead, they simulated only one waggle duration per bee. This of course attenuated the simulated R^2 values. When they used the reported number of waggle durations, as in the experimental design and analysis, 19% of R^2 values exceeded the reported value. This is still an underestimate [1].
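The effect of averaging can be illustrated with a small simulation. The sketch below is not the code used by either party, and every number in it (the distances, slope, intercept, standard deviation and the 100 observations per point) is a hypothetical placeholder rather than a value from Table 1 of Srinivasan et al. It only shows that regressing on means of many noisy observations gives a much higher R^2 than regressing on single observations drawn with the same standard deviation.

```python
import random
import statistics

def r_squared(xs, ys):
    """R^2 of a simple least-squares line (squared Pearson correlation)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy * sxy / (sxx * syy)

# All values below are hypothetical, chosen only for illustration.
DISTANCES = [60, 110, 150, 190, 230, 270, 350]  # feeder distances (m)
SLOPE, INTERCEPT = 1.9, 100.0                   # assumed true line (ms/m, ms)
SD = 100.0                                      # SD of one waggle duration (ms)

def simulated_r2(n_per_point, rng):
    """Draw n_per_point durations at each distance, average, then regress."""
    means = []
    for d in DISTANCES:
        centre = INTERCEPT + SLOPE * d
        draws = [rng.gauss(centre, SD) for _ in range(n_per_point)]
        means.append(statistics.fmean(draws))
    return r_squared(DISTANCES, means)

rng = random.Random(0)
median_single = statistics.median(simulated_r2(1, rng) for _ in range(500))
median_averaged = statistics.median(simulated_r2(100, rng) for _ in range(500))

print(f"median R^2, one observation per point:  {median_single:.3f}")
print(f"median R^2, 100 observations per point: {median_averaged:.3f}")
```

The standard deviation of the individual observations is identical in both cases. Only the number of observations behind each plotted point changes, which is the distinction between a standard deviation and a standard error of the mean.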

The second major flaw in Luebbert and Pachter’s simulations was that they did not model the linear regression as a linear function plus sampling error (as given by the reported standard deviations). Instead, they used the sample estimates reported by Srinivasan et al. as their starting point. That is, they confused sample estimates with underlying population values. They therefore included the initial sampling error in all simulation runs, then added further sampling error, attenuating the resulting R^2 values. When both these errors are corrected, the majority of simulated values exceed the reported value. This assumes that the waggle durations satisfy the independence assumption, which is unlikely. However, if it is assumed that the number of independent observations is 28.2% of the reported numbers, 50% of simulated R^2 values exceed the reported value. This is a plausible value for the degree of statistical dependence between observations. For further detail see my recent post on PubPeer [2].
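The difference between the two starting points can also be sketched in a few lines. Again this is an illustration under stated assumptions, not a reproduction of any published analysis: the distances, the true line and the standard error of each mean are hypothetical, and each mean is drawn directly from a normal distribution with that standard error instead of being averaged from individual durations. In each run a set of "observed" means is generated, then new data are simulated either around those observed means or around the line fitted to them.

```python
import random
import statistics

def r_squared(xs, ys):
    """R^2 of a simple least-squares line (squared Pearson correlation)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy * sxy / (sxx * syy)

def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def noisy_means(centres, se, rng):
    """One simulated mean per centre, with sampling error of size se."""
    return [rng.gauss(c, se) for c in centres]

# Hypothetical values, for illustration only.
DISTANCES = [60, 110, 150, 190, 230, 270, 350]
TRUE_LINE = [100.0 + 1.9 * d for d in DISTANCES]
SE = 10.0  # assumed standard error of each mean (ms)

rng = random.Random(1)
from_means, from_line = [], []
for _ in range(2000):
    observed = noisy_means(TRUE_LINE, SE, rng)      # stand-in for reported means
    slope, intercept = fit_line(DISTANCES, observed)
    fitted = [intercept + slope * d for d in DISTANCES]
    # Start from the observed means: their sampling error is carried into the run.
    from_means.append(r_squared(DISTANCES, noisy_means(observed, SE, rng)))
    # Start from the fitted line: linear function plus one layer of error.
    from_line.append(r_squared(DISTANCES, noisy_means(fitted, SE, rng)))

median_from_means = statistics.median(from_means)
median_from_line = statistics.median(from_line)
print(f"median R^2 starting from observed means: {median_from_means:.4f}")
print(f"median R^2 starting from fitted line:    {median_from_line:.4f}")
```

Starting from the observed means stacks two layers of sampling error, so the simulated R^2 values are systematically lower than when the simulation starts from points on the fitted line.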

There were many other scientific misunderstandings and statistical errors in Luebbert and Pachter’s account. I would encourage anyone who believes that Srinivasan’s group produced fake data to read the detailed responses to Luebbert and Pachter’s non-peer-reviewed article on PubPeer and to avoid taking their assertions at face value without considering alternative explanations.

References

[1] Stuart, G.W. (2024). Miscalibration of simulations: A comment on Luebbert and Pachter: ‘Miscalibration of the honeybee odometer’ https://arxiv.org/abs/2408.07713

[2] https://pubpeer.com/publications/A8A29AC253893D0265039D250B82F5

[3] https://github.com/pachterlab/LP_2024/blob/main/response_to_rebuttal.pdf

[4] Srinivasan, M. V., Zhang, S., Altwein, M., & Tautz, J. (2000). Honeybee navigation: Nature and calibration of the “odometer.” Science, 287(5454), 851–853. https://www.jstor.org/stable/3074328

[5] https://www.science.org/content/article/buzzkill-accusations-leveled-research-dancing-bees-measure-distances

MV Srinivasan on January 29, 2025 at 5:57 am said:

The use of phrases such as “fake data” and “research misconduct” to describe anybody’s scientific body of work is problematic, particularly when it is based on hearsay rather than on a thorough examination of scientific facts.

It is true that we made several inadvertent errors in our 1996 and 1997 papers published by the Journal of Experimental Biology (JEB). The 1996 paper was an invited review paper that covered a few of our preliminary studies on bee navigation. A part of this work on bee odometry was continued and the findings published as a full body of work in 1997. We acknowledged that we were using some of the findings from the 1996 review for this paper, so there was no undisclosed data duplication as has been alleged. The inadvertent mistakes related to erroneous numbers that we reported for the same experimental condition across the two papers. This gave the impression that the same data was being used to present different experimental conditions, which was not the case. We have been extremely transparent about this in the corrections that we supplied to JEB, and which the journal has published.

While we appreciate the discovery of discrepancies in our data, the errors were inadvertent, and not deliberate attempts at data falsification.

Two commentators on this blog, Woodpecker and Anonymous, appear to be familiar with our area of research. Woodpecker has agreed that our studies have been repeated by other labs. Anonymous, who has provided a thoughtful and clear summary of the issues, seems to agree with our comment that the errors in the 1996 and 1997 papers do not affect the overall results and conclusions of the studies, which have been repeated several times in our laboratory as well as in other laboratories with similar results, as we have pointed out in our rebuttals on PubPeer: https://pubpeer.com/publications/A8A29AC253893D0265039D250B82F5.

The claims of miscalibration of the honeybee odometer have been refuted thoroughly through scientific arguments in our rebuttals, and the journal Science has published an amendment to their news article to state that they stand by the 2000 paper which was the focus of these accusations.

On the important question of the claims about the statistical analysis in several of our papers, our co-author and expert biostatistician Geoffrey Stuart has provided an in-depth account of why these allegations have serious flaws. This has been pointed out in our rebuttals, as well as in this blog. The most serious allegations were based on erroneous simulations of our statistical analysis, leading to the conclusion that the R^2 values in six of our studies were unrealistically high, with the imputation that our data were not genuine. In our 2000 Science paper, we were able to accurately recover the linear relationship between bee waggle duration and distance to a food source by measuring a very large number of waggle durations from several bees at each distance and then using those averages in the regression model. In the simulations that were the basis for the allegations, the underlying linear function was not taken into account. And, most importantly, only one waggle duration per bee was used in the simulations. As shown by Stuart [https://arxiv.org/abs/2408.07713], this is the reason that these simulations could not reproduce our high R^2 values. This fundamental error, committed by those who made the allegations, has gone unnoticed by those who have uncritically accepted the allegations.

In summary, we categorically refute any claims of deliberate duplication or falsification of data in any work conducted in our labs. We would urge participants in this blog to read our rebuttals carefully before making judgements about our work.

MV Srinivasan


Statistical Modeling, Causal Inference, and Social Science
