Can we do better than using averaged measurements?

Angus Reynolds writes:

Recently a PhD student at my University came to me for some feedback on a paper he is writing about the state of research methods in the Fear Extinction field. Basically you give someone an electric shock repeatedly while they stare at neutral stimuli and then you see what happens when you start showing them the stimuli and don’t shock them anymore. Power will always be a concern here because of the ethical problems.

Most of his paper is commenting on the complete lack of consistency between and within labs in how they analyse data. Plenty of garden of forking paths, and concerns about type 1, type 2, and type S and type M errors.

One thing I’ve been pushing him to talk about more is improved measurement.

Currently fear is measured in part by taking skin conductance measurements continuously and then summarising an 8 second or so window between trials into averages, which are then split into blocks and ANOVA’d.

I’ve commented that they must be losing information if they are summarising a continuous (and potentially noisy) measurement over time to 1 value. It seems to me that the variability within that 8 second window would be very important as well. So why not just model the continuous data?
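To make the information-loss point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the 10 Hz sampling rate, the signal levels, the noise sizes, and the helper name `summarise_window` are hypothetical and do not come from any lab's actual pipeline. It builds two simulated trial windows with the same underlying level but very different within-window variability, and shows that the per-window average barely distinguishes them while the within-window standard deviation does.

```python
import random
import statistics

def summarise_window(samples):
    """Return the mean and standard deviation of one trial window."""
    return statistics.fmean(samples), statistics.stdev(samples)

# Hypothetical setup: an 8-second window sampled at 10 Hz (80 samples per trial).
rng = random.Random(0)
trial_a = [rng.gauss(5.0, 0.2) for _ in range(80)]  # steady signal
trial_b = [rng.gauss(5.0, 1.5) for _ in range(80)]  # noisy signal, same level

mean_a, sd_a = summarise_window(trial_a)
mean_b, sd_b = summarise_window(trial_b)

# The averages are close to each other; the standard deviations are not.
print(f"trial A: mean={mean_a:.2f}, sd={sd_a:.2f}")
print(f"trial B: mean={mean_b:.2f}, sd={sd_b:.2f}")
```

Keeping a second summary per window, such as the standard deviation, is only the simplest step beyond averaging. It is not a model of the continuous signal, and the independent draws used here ignore the autocorrelation that real skin conductance traces would have.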

Given that the field could be at least two steps away from where it needs to be (immature data, immature methods), I’ve suggested that he just start by making graphs of the complete data that he would like to be able to model one day and not to really bother with p-value style analyses.

In terms of developing the skills necessary to move forward: would you even bother trying to create models of the fear extinction process using the simplified, averaged data that most researchers use or would it be better to get people accustomed to seeing the continuous data first and then developing more complex models for that later?

My reply:

I actually don’t think it’s so horrible to average the data in this way. Yes, it should be better to model the data directly, and, yes, there has to be some information being lost by the averaging, but empirical variation is itself very variable, so it’s not like you can expect to see lots of additional information by comparing groups based on their observed variances.
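A small simulation can illustrate how variable empirical variation is. This is a sketch under stated assumptions, not an analysis of real data: the observations are independent normal draws with true variance 1, and the function name `simulate_variances`, the sample size of 10, and the 2000 replications are all hypothetical choices. For normal data the relative standard deviation of the sample variance is sqrt(2/(n-1)), which is about 0.47 when n = 10.

```python
import random
import statistics

def simulate_variances(n_obs, n_reps, sigma=1.0, seed=1):
    """Sample variances from n_reps datasets of n_obs normal draws each."""
    rng = random.Random(seed)
    return [
        statistics.variance([rng.gauss(0.0, sigma) for _ in range(n_obs)])
        for _ in range(n_reps)
    ]

# Every simulated dataset has the same true variance (1.0),
# so any spread in the observed variances is pure sampling noise.
variances = sorted(simulate_variances(n_obs=10, n_reps=2000))
low = variances[int(0.05 * len(variances))]
high = variances[int(0.95 * len(variances))]
rel_sd = statistics.stdev(variances) / statistics.fmean(variances)

print(f"5th to 95th percentile of observed variance: {low:.2f} to {high:.2f}")
print(f"relative sd of the variance estimate: {rel_sd:.2f}")
```

The wide percentile range shows why comparing groups on their observed variances tends to be noisy. Real within-window measurements are autocorrelated, so the effective number of independent observations would likely be smaller than the raw sample count, which makes the variance estimate noisier still.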

I agree 100% with your suggestion of graphing the complete data. Regarding measurement, I think the key is for it to be connected to theory where possible. Also from the above description it sounds like the research is using within-person comparisons, which I generally recommend.


  1. Joana says:

    Hello Professor Andrew! I’m taking a master’s degree in molecular biomedicine and I have a class in biostatistics. I need to give a presentation about a scientific article that has faulty statistics. The software packages that I use in class are SPSS and GPower. It’s really hard to find a good article (related to the clinical area and to the software that I use) with “bad” statistics. My professor wants to see errors like “the sample size is not right” or “it has wrong conclusions because of the p-value” etc. Perhaps you know something? I didn’t know how to contact you, so I used this comments section, I’m so sorry… Thanks in advance!!

    • Jonas says:

      Joana, if you want to contact Gelman directly you should look at his university webpage. But I don’t think that he can help you, as he probably isn’t familiar with molecular biomedicine. I have some bad -omics papers if that could be of interest to you.
      Can you tell me why your supervisor needs the software to be the same?

      • Joana says:

        Thank you, Jonas. I’m interested!! I think it’s because we only learnt the basic tests, like t test, z test, etc. In the next weeks, we will learn meta analysis.

        • Yes and maybe the next week you can learn neurosurgery.

          • So I was encouraged to clarify here, because my comment was pretty snarky, and honestly it’s not fair to snark on the poor student. The people I’m upset with here are the teachers of statistics and the textbook writers and the universities and online courses etc., because the impression given by those educators is that statistics is a pushbutton field where in a week or so they can teach you about the buttons you need to push to do meta analysis. This attitude is responsible for a lot of bad things in the world, especially for example in bio-medical research, where researchers routinely use inappropriate thinking because it’s what they were taught stats was about. So I apologize for the tone towards Joana but still am upset with the underlying problem.

  2. This kind of averaging can have serious adverse consequences in psycholinguistics (reading studies). It hides an important variance component.

  3. Justin Smith says:

    I’d say it could be good or bad to take the average like this. Typically, as mentioned, using the continuous measure is preferable. On the other hand, there are a lot of successful quality control approaches that do the analysis on the averages of batches.

