
Measurement is part of design

The other day, in the context of a discussion of an article from 1972, I remarked that the great statistician William Cochran, when writing on observational studies, wrote almost nothing about causality, nor did he mention selection or meta-analysis.

It was interesting that these topics, which are central to any modern discussion of observational studies, were not considered important by a leader in the field, and this suggests that our thinking has changed since 1972.

Today I’d like to make a similar argument, this time regarding the topic of measurement. This time I’ll consider Donald Rubin’s 2008 article, “For objective causal inference, design trumps analysis.”

All of Rubin’s article is worth reading—it’s all about the ways in which we can structure the design of observational studies to make inferences more believable—and the general point is important and, I think, underrated.

When people do experiments, they think about design, but when they do observational studies, they think about identification strategies, which are related to design but differ in that they’re all about finding and analyzing data and checking assumptions, not so much about systematic data collection. So Rubin makes valuable points in his article.

But today I want to focus on something that Rubin doesn’t really mention in his article: measurement, which is a topic we’ve been talking a lot about here lately.

Rubin talks about randomization, or the approximate equivalent in observational studies (the “assignment mechanism”), and about sample size (“traditional power calculations,” as his article was written before Type S and Type M errors were well known), and about the information available to the decision makers, and about balance between treatment and control groups.

Rubin does briefly mention the importance of measurement, but only in the context of being able to match or adjust for pre-treatment differences between treatment and control groups.

That’s fine, but here I’m concerned with something even more basic: the validity and reliability of the measurements of outcomes and treatments (or, more generally, comparison groups). I’m assuming Rubin was taking validity for granted—assuming that the x and y variables being measured were the treatment and outcome of interest—and, in a sense, the reliability question is included in the question about sample size. In practice, though, studies are often using sloppy measurements (days of peak fertility, fat arms, beauty, etc.), and if the measurements are bad enough, the problems go beyond sample size, partly because in such studies the sample size would have to creep into the zillions for anything to be detectable, and partly because the biases in measurement can easily be larger than the effects being studied.
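To make the sample-size point concrete, here is a minimal sketch in Python. It assumes classical measurement error (noise independent of the true values and of the other variable), under which the observed correlation equals the true correlation times the square root of the product of the two reliabilities. The function names, the Fisher z approximation for the required sample size, and the reliability values are illustrative assumptions of this sketch, not anything taken from Rubin's article or from a particular study.

```python
import math
from statistics import NormalDist


def attenuated_correlation(true_corr, reliability_x, reliability_y):
    """Observed correlation under classical measurement error.

    Assumes noise is independent of the true scores and of each other,
    so observed = true * sqrt(reliability_x * reliability_y).
    """
    return true_corr * math.sqrt(reliability_x * reliability_y)


def n_for_correlation(corr, alpha=0.05, power=0.80):
    """Approximate sample size to detect a correlation (two-sided test).

    Uses the Fisher z approximation: n = ((z_alpha + z_power) / atanh(r))^2 + 3.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    return math.ceil(((z_alpha + z_power) / math.atanh(abs(corr))) ** 2 + 3)


if __name__ == "__main__":
    true_corr = 0.10  # hypothetical small true effect
    for reliability in (1.0, 0.5, 0.25, 0.1):  # illustrative values only
        observed = attenuated_correlation(true_corr, reliability, reliability)
        print(reliability, round(observed, 4), n_for_correlation(observed))
```

Running the loop shows the required sample size growing roughly in proportion to one over the product of the reliabilities. Note that this only covers random noise: a systematic bias in the measurements does not shrink as the sample grows, so no sample size fixes it.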

So, I’d just like to take Rubin’s excellent article and append a brief discussion of the importance of measurement.

P.S. I sent the above to Rubin, who replied:

In that article I was focusing on the design of observational studies, which I thought had been badly neglected by everyone in past years, including Cochran and me. Issues of good measurement, I think I did mention briefly (I’ll have to check—I do in my lectures, but maybe I skipped that point here), but having good measurements had been discussed by Cochran in his 1965 JRSS paper, so were an already emphasized point.

And I wanted to focus on the neglected point, not all relevant points for observational studies.


  1. Bill Harris says:

    Andrew, I don’t see the link in the posting, but I presume is Rubin’s article in question.

  2. Alex says:

    I know a lot of Andrew’s posts on measurement reference psychology, but I also wanted to mention that reliability calculations and measurement have become a very important focus of the research I do with proteomic data.

    I think we spend a lot of time thinking about Type I errors, FDR control and the p-value debate, but when measurement is an issue, false negatives are also a big concern. That is, they obfuscate strong dependencies that DO exist. In biology, researchers have tried to explain low empirical correlations between quantities (e.g. mRNA and protein) with complicated regulatory mechanisms. However, when you properly account for the large (and systematic) errors, the true dependencies between molecules are quite strong. Andrew, you have said “correlation does not even imply correlation”, but I would say that with measurement noise it’s equally true that “lack of correlation does not imply lack of correlation”. This may be obvious to statisticians, but it is often overlooked by practitioners.

  3. James says:

    When measurement is noisy as described this leaves us with a few options:

    1. Continue our research using current methods and make statements about the phenomenon under study that are more likely to be false than true.
    2. Focus our research on improving the measurement methods.
    3. Change our focus of research to an area where the measurement issues have already been addressed.

  4. jrc says:

    My lecture on Rob Jensen’s cell phone and fish prices paper was called “Empirical Design (greater than) Econometrics”. I think that he nails all 3 of the things we should be concerned about at the intersection of measurement and causal inference in observational studies: a) clear and convincing variation in the “treatment” variable (identification); b) a carefully measured outcome variable (reliability); and c) an analysis of changes in the outcome that is theoretically motivated and directly tests the hypothesis under consideration (validity).

    And then he puts it all in one beautiful graph. Figure 4 alternately makes me want to try a lot harder and just give up trying at all because I’ll never come up with anything that awesome.

  5. sechrest says:

    Problems of measurement in the social and behavioral sciences have been underestimated or ignored altogether for at least the time spanned by my career. Think of the effort that goes into measurement in other sciences, e.g., astrophysics, and the contrast is staggering.
    I apologize for the late response, but time zones and other demands on my time get in the way.


  6. Richard McElreath says:

    For historical completeness, worth noting that Cochran did write about causation some. At least there’s a passage from a 1965 paper that is fairly well known in evolutionary biology (because he quotes Fisher):

    About 20 years ago, when asked in a meeting what can be done in observational studies to clarify the step from association to causation, Sir Ronald Fisher replied: “Make your theories elaborate”. The reply puzzled me at first, since by Occam’s razor the advice usually given is to make theories as simple as is consistent with the known data. What Sir Ronald meant, as the subsequent discussion showed, was that when constructing a causal hypothesis one should envisage as many different consequences of its truth as possible, and plan observational studies to discover whether each of these consequences is found to hold.

    It’s in section 5 (page 252) of this paper:

    The section goes on to discuss issues related to sampling, measurement bias (sometimes through selection effects), and meta-analysis. It’s all very brief and vague, though.

  7. Keith O'Rourke says:

    As per last time and Rubin’s email, authors need to focus on some, not all, aspects of what they are writing on. The analogy is repairing a ship at sea: you can’t tear the ship completely apart to make all the needed repairs at once.

    That X appears not to have been mentioned, on a reading of a select subset of an author’s writings, may be an interesting hypothesis, but a worthwhile inference it’s not.

    • James says:

      I do not disagree with your analogy — but, the appropriate alternative is not the analogy of the Captain standing in the ship’s hull in water up to the waist saying, “Water? What water?”.

      • Keith O'Rourke says:

        James: There is always water and it’s a constant struggle to have less on board.

        Another analogy, science is like standing in a bog – you stand where it seems firm now being always ready to move when it gives way.

        So an author focuses on where they currently see the bog giving way the most – right now.

        (Rubin’s comment: “And I wanted to focus on the neglected point, not all relevant points for observational studies.”)

        • James says:

          Another analogy with which I do not disagree. However, what I see happening in areas of psychology is the development of rickety structures that allow researchers to precariously stand in areas of the bog that would suck them under otherwise.
