The piranha problem in social psychology / behavioral economics: The “take a pill” model of science eats itself

[cat picture]

A fundamental tenet of social psychology and behavioral economics, at least as presented in the news media and as taught and practiced in many business schools, is that small “nudges,” often the sorts of things that we might not think would affect us at all, can have big effects on behavior. Thus the claims that elections are decided by college football games and shark attacks, or that the subliminal flash of a smiley face can cause huge changes in attitudes toward immigration, or that single women were 20% more likely to vote for Barack Obama, or three times more likely to wear red clothing, during certain times of the month, or that standing in a certain position for two minutes can increase your power, or that being subliminally primed with certain words can make you walk faster or slower, etc.

The model of the world underlying these claims is not just the “butterfly effect” that small changes can have big effects; rather, it’s that small changes can have big and predictable effects. It’s what I sometimes call the “button-pushing” model of social science, the idea that if you do X, you can expect to see Y. Indeed, we sometimes see the attitude that the treatment should work every time, so much so that any variation is explained away with its own story.

In response to this attitude, I sometimes present the “piranha argument,” which goes as follows: There can be some large and predictable effects on behavior, but not a lot, because, if there were, then these different effects would interfere with each other, and as a result it would be hard to see any consistent effects of anything in observational data.

The analogy is to a fish tank full of piranhas: it won’t take long before they eat each other.

An example

I recently came across an old post which makes the piranha argument pretty clearly with an example. I was talking about a published paper from 2013, “‘Black and White’ thinking: Visual contrast polarizes moral judgment,” which fell in the embodied cognition category. The claim in that paper was that “incidental visual cues without any affective connotation can similarly shape moral judgment by priming a certain mindset . . . exposure to an incidental black and white visual contrast leads people to think in a ‘black and white’ manner, as indicated by more extreme moral judgments.”

The study had the usual statistical problem of forking paths, so I don’t think it makes sense to take its empirical claims seriously. But that’s not where I want to go today. Rather, my point here is the weakness of the underlying theory, in light of all the many many other possible stories that have been advanced to explain attitudes and behavior.

Here’s what I wrote:

I don’t know whether to trust this claim, in light of the equally well-documented finding, “Feeling Blue and Seeing Blue: Sadness May Impair Color Perception.” Couldn’t the Zarkadi and Schnall result be explained by an interaction between sadness and moral attitudes? It could go like this: Sadder people have difficulty with color perception so they are less sensitive to the different backgrounds in the images in question. Or maybe it goes the other way: sadder people have difficulty with color perception so they are more sensitive to black-and-white patterns.

I’m also worried about possible interactions with day of the month for female participants, given the equally well-documented findings correlating cycle time with political attitudes and—uh oh!—color preferences. Again, these factors could easily interact with perceptions of colors and also moral judgment.

What a fun game! Anyone can play.

Hey—here’s another one. I have difficulty interpreting this published finding in light of the equally well-documented finding that college students have ESP. Given Zarkadi and Schnall’s expectations as stated in their paper, isn’t it possible that the participants in their study simply read their minds? That would seem to be the most parsimonious explanation of the observed effect.

Another possibility is the equally well-documented himmicanes and hurricanes effect—I could well imagine something similar with black-and-white or color patterns.

But I’ve saved the best explanation for last.

We can most easily understand the effect discovered by Zarkadi and Schnall in the context of the well-known smiley-face effect. If a cartoon smiley face flashed for a fraction of a second can create huge changes in attitudes, it stands to reason that a chessboard pattern can have large effects too. The game of chess, after all, was invented in Persia, and so it makes sense that being primed by a chessboard will make participants think of Iran, which in turn will polarize their thinking, with liberals and conservatives scurrying to their opposite corners. In contrast, a blank pattern or a colored grid will not trigger these chess associations.

Aha, you might say: chess may well have originated in Persia but now it’s associated with Russia. But that just bolsters my point! An association with Russia will again remind younger voters of scary Putin and bring up Cold War memories for the oldsters in the house: either way, polarization here we come.

In a world in which merely being primed with elderly-related words such as “Florida” and “bingo” causes college students to walk more slowly (remember, Daniel Kahneman told us “You have no choice but to accept that the major conclusions of these studies are true”), it is no surprise that being primed with a chessboard can polarize us.

I can already anticipate the response to the preregistered replication that fails: There is an interaction with the weather. Or with relationship status. Or with parents’ socioeconomic status. Or, there was a crucial aspect of the treatment that was buried in the 17th paragraph of the published paper but turns out to be absolutely necessary for this phenomenon to appear.

Or . . . hey, I have a good one: The recent nuclear accord with Iran and rapprochement with Russia over ISIS has reduced tension with those two chess-related countries, so this would explain a lack of replication in a future experiment.

I wrote the above in a silly way but my point is real: Once you accept that all these large effects are out there, it becomes essentially impossible to interpret any claim—even from experimental data—because it can also be explained as an interaction of two previously-identified large effects.

Randomized experiment is not enough

Under the button-pushing model of science, there’s nothing better than a randomized experiment: it’s the gold standard! Really, though, there are two big problems with the sort of experimental data described above:

1. Measurement error. When measurements are noisy and biased, any patterns you see will not in general replicate—that is, type M and type S errors will be large (see the simulation sketch just after this list). Meanwhile, forking paths allow researchers the illusion of success, over and over again, and enablers such as the editors of PNAS keep this work in the public eye.

2. Interactions. Even if you do unequivocally establish a treatment effect from your data, the estimate only applies to the population and scenario under study: psychology students in university X in May, 2017; or Mechanical Turk participants in May, 2017, asked about topic Y; etc. And in the “tank full of piranhas” context where just about anything can have a large effect—from various literatures, there’s menstrual cycle, birth order, attractiveness of parents, lighting in the room, subliminal smiley faces, recent college football games, parents’ socioeconomic status, outdoor temperature, names of hurricanes, the grid pattern on the edge of the survey form, ESP, the demographic characteristics of the experimenter, and priming on just about any possible stimulus. In this piranha-filled world, the estimate from any particular experiment is pretty much uninterpretable.
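
A quick simulation sketch of point 1 (the numbers are invented: a true effect of 0.1 on a standardized outcome, measurement noise with standard deviation 1, and 20 people per arm):

    import numpy as np

    rng = np.random.default_rng(0)
    true_effect, sd, n, sims = 0.1, 1.0, 20, 10_000   # invented numbers

    signif = []
    for _ in range(sims):
        treat = rng.normal(true_effect, sd, n)   # noisy outcome, treated arm
        ctrl = rng.normal(0.0, sd, n)            # noisy outcome, control arm
        est = treat.mean() - ctrl.mean()
        se = np.sqrt(treat.var(ddof=1) / n + ctrl.var(ddof=1) / n)
        if abs(est / se) > 1.96:                 # "statistically significant"
            signif.append(est)

    signif = np.array(signif)
    print("share of studies reaching significance:", len(signif) / sims)
    print("type M (exaggeration) factor:", np.abs(signif).mean() / true_effect)
    print("type S (wrong sign) rate:", np.mean(np.sign(signif) != np.sign(true_effect)))

With settings like these, only a small fraction of studies reach significance, and conditional on significance the estimate is several times larger than the true effect and occasionally has the wrong sign. That’s the type M and type S problem in a few lines of arithmetic.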

To put it another way: if you do one of these experiments and find a statistically significant pattern, it’s not enough for you to defend your own theory. You also have to make the case that just about everything else in the social psychology / behavioral economics literature is wrong. Cos otherwise your findings don’t generalize. But we don’t typically see authors of this sort of paper disputing the rest of the field: they all seem happy thinking of all this work as respectable.

I put this post in the Multilevel Modeling category because ultimately I think we should think about all these effects, or potential effects, in context. All sorts of confusion arise when thinking about them one step at a time. For just one more example, consider ovulation-and-voting and smiley-faces-and-political attitudes. Your time of the month could affect your attention span and your ability to notice or react to subliminal stimuli. Thus, any ovulation-and-voting effect could be explained merely as an interaction with subliminal images on TV ads, for example. And any smiley-faces-and-political-attitudes effect could, conversely, be explained as an interaction with changes in attitudes during the monthly cycle. I don’t believe any of these stories; my point is that if you really buy into the large-predictable-effects framework of social psychology, then it does not make sense to analyze these experiments in isolation.

P.S. Just to clarify: I don’t think all effects are zero. We inherit much of our political views from our parents; there’s also good evidence that political attitudes and voting behavior are affected by economic performance, candidate characteristics, and the convenience of voter registration. People can be persuaded by political campaigns to change their opinions, and attitudes are affected by events, as we’ve seen with the up-and-down attitudes on health care reform in recent decades. The aquarium’s not empty. It’s just not filled with piranhas.

41 thoughts on “The piranha problem in social psychology / behavioral economics: The “take a pill” model of science eats itself”

  1. “There can be some large and predictable effects on behavior, but not a lot, because, if there were, then these different effects would interfere with each other, and as a result it would be hard to see any consistent effects of anything in observational data.
    The analogy is to a fish tank full of piranhas: it won’t take long before they eat each other.”

    Not only can the piranhas turn on each other, they can also possibly turn into goldfish!

    Here is the always brilliant (-ly funny) social psychologist Dijksterhuis on the latest replication failure of his work:

    1) “However, the finding that the priming effect is evident among “naive” participants supports earlier findings showing that priming effects disappear or even reverse among people who are aware of the potential effects of the prime”

    2) “Moreover, priming as a mechanism is much more well‐known than in the past, definitely among students, but most likely also outside academic psychology. In most countries packs of cigarettes contain images that people know are supposed to influence their behavior. The idea that merely being exposed to something that may then exert some kind of influence is not nearly as mystifying now as it was twenty years ago.”

    So, if i am understanding Dijksterhuis correctly, would that not imply that images on cigarettes may actually *cause* folks to smoke more?!

    And if that may be correct, who should we hold accountable for this?

    https://www.psychologicalscience.org/redesign/wp-content/uploads/2017/11/Dijksterhuis_RRRcommentary_ACPT.pdf

    • Anon:

      Yes, in the linked article, Dijksterhuis writes, “The idea that merely being exposed to something that may then exert some kind of influence is not nearly as mystifying now as it was twenty years ago.”

      But the thing he doesn’t seem to realize is that, as Euclid might put it, there are an infinite number of primes. He mentions that “packs of cigarettes contain images that people know are supposed to influence their behavior.” But that’s hardly a “prime” the way we usually understand it: it’s a direct message that smoking kills. Here are some primes in the sense of hidden signals of the sort that Bargh etc. like to study: the shape of the cigarette box, the color of the box, the font on the box (is it an old font or a young font? etc.), the color of the words, the relative sizes of different text on the box, the color and size of the picture of the camel on the box, the place where the cigarettes are being sold, the body language of whoever’s selling the cigs, etc etc etc. As discussed in my above post, there’s no way that all or even most of those primes could be having big effects, and these studies always seem to be set up as if all that matters is the particular prime being focused on right then, with all the others being irrelevant. So, in a study of A, all that matters is A, and factor B is ignored, it’s not even measured; and in a study of B, factor A is ignored. With no sense of the incoherence of these different studies and their claims.

      • This is the Dijksterhuis of “unconscious thought theory” fame, whose publication history reads like a walk through the garden of forking paths. First, decisions are better after allowing for “unconscious process,” then only certain types of decisions, then only by certain metrics, etc.

        • “This is the Dijksterhuis of “unconscious thought theory” fame, whose publication history reads like a walk through the garden of forking paths. First, decisions are better after allowing for “unconscious process,” then only certain types of decisions, then only by certain metrics, etc.”

          Yep, that’s the one!

          He’s like the Dutch version of Bargh, with possibly similar “divaesque” behaviour when someone criticizes his work (see comment section here for instance: http://journals.plos.org/plosone/article/comments?id=10.1371/journal.pone.0056515). I think he’s hilarious.

          * Unconscious Thought Theory (UTT): https://en.wikipedia.org/wiki/Unconscious_thought_theory

          “Researchers Ap Dijksterhuis, Maarten W. Bos, Loran F. Nordgren, and Rick B. van Baaren tested this hypothesis in a series of studies measuring choice quality and post-choice satisfaction after participants used conscious and unconscious deliberation. The studies supported the deliberation-without-attention effect: conscious thinkers were better able to make normatively more desirable choices between simple products, but unconscious thinkers were better able to choose between complex products.”

          * Failed replication of UTT and meta-analysis: https://www.researchgate.net/publication/270575120_On_making_the_right_choice_A_meta-analysis_and_large-scale_replication_attempt_of_the_unconscious_thought_advantage

          “Consistent with the reliability account, the large-scale replication study yielded no evidence for the UTA, and the meta-analysis showed that previous reports of the UTA were confined to underpowered studies that used relatively small sample sizes. Furthermore, the results of the large-scale study also dispelled the recent suggestion that the UTA might be gender-specific. Accordingly, we conclude that there exists no reliable support for the claim that a momentary diversion of thought leads to better decision making than a period of deliberation.”

          * Why the original published UTT findings did not even show (strong) evidence: https://osf.io/preprints/psyarxiv/j944a/

          “As a case study, we perform a rigorous post-publication peer review of the theoretical core of Unconscious Thought Theory (UTT). We present several uncomplicated indices that quantify the quality of evaluation of the results and conclusions by experts who reviewed the article and the amount of interpretation bias on behalf of the authors. The results reveal a failure of expert peer-review to detect empirical reports of sub-standard quality. The analyses reveal there is in fact hardly any empirical evidence for the predictions of UTT in important papers that claim its support. Our advice is to evaluate before you replicate.”

    • I think he is talking about “fear inducing” pictures (gross pictures of people dying from smoking or a pic of a dying man’s lungs). Don’t know if they are effective though.

  2. This post underscores the importance of distinguishing between (1) effects that shed light on the workings of human psychology, and (2) effects of practical importance in shaping behavior. The piranha argument is effective in countering (2), but still leaves a lot of room for (1). Priming effects might be entirely illusory (an artifact of forking paths, etc.), but to the extent they are real they tell us something about how human behaviors are linked to environmental cues, and that seems valuable in itself.

    TED-talkers need to be careful not to overplay the practical implications, because these studies are a long way from establishing that these linkages can be used to shape behavior with the magnitude and consistency needed for useful application. That’s where I see the piranha analogy coming in. But critics also need to be careful not to use that analogy to discount the research entirely (assuming it survives forking paths critiques), since even small and fleeting effects can help us develop a better understanding of human psychology that may one day be useful, and in the meantime is interesting in itself to those who care about such things. (And who are we to tell people in other fields what they should care about?)

    • An analogy I’ve grown to like for the “small and fleeting effects” is the color of a box vs its volume.

      If you didn’t know the relationship V = l*w*h, you may use NHST to “test” whether or not zillions of things were significantly correlated with volume. As a result you would find many irrelevant, yet real correlations. Things like the color of the box, material of the box, local humidity, etc.

      You can then run experiments to support such results. Take a red box with volume V0 and paint it a new color (e.g., blue). The paint will add to the thickness of the walls and thus affect the volume of the box. If the volume is measured from the outside it will seem to increase slightly (V1 > V0), yet measured from the inside it will seem to decrease (V1 < V0). This would be considered a mysterious paradox that requires more research on additional boxes in various conditions.

      The paint effect would then be seen to be modulated by color (since some pigments will be denser than others), temperature, etc. What if we measure a red box, then paint it red again? Well, it got thicker, but the shade of red also changed a bit, so maybe what matters is just that the color changed at all…

      I really don't see a limit to how much time/money can be wasted going down this path without ever figuring out V = l*w*h. It seems completely inappropriate.
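
      Here’s a toy version of the box story in Python (my own invented attributes and numbers): volume really is l*w*h, but an NHST screen over incidental attributes will happily flag “significant” correlates that say nothing about the law.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(1)
        n = 500
        l, w, h = (rng.lognormal(0, 0.3, n) for _ in range(3))
        volume = l * w * h

        # Incidental attributes, loosely tied to box size (bigger boxes tend to be
        # plain cardboard and to sit in more humid warehouses); color is pure noise.
        size = l + w + h
        material = (size + rng.normal(0, 1, n) > size.mean()).astype(float)
        humidity = 0.3 * size + rng.normal(0, 1, n)
        color_code = rng.integers(0, 5, n).astype(float)

        for name, x in [("material", material), ("humidity", humidity),
                        ("color_code", color_code)]:
            r, p = stats.pearsonr(x, volume)
            print(f"{name:10s} r = {r:+.2f}, p = {p:.3g}")

      Material and humidity come out “significant” (real correlations, since both are tied to box size), yet the screen never gets any closer to V = l*w*h.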

    • Rjb:

      I agree that small and fleeting effects can be important. But I don’t think they can be studied using noisy experiments. Researchers seem to have a naive faith in randomization: they think that if an effect is measured in a randomized experiment, it’s kosher no matter what the measurements are and who’s being measured (see for example the quote here). But in a world of small effects and large interactions: No, the pattern in the data is only interesting to the extent that it generalizes, or that it fits a pre-existing model.

      So, yes on studying small and fleeting effects (which can vary in sign as well as magnitude); no on estimating them using conventional between-person designs with crappy measurements.

    • (And who are we to tell people in other fields what they should care about?)

      “We” might be the taxpayers, who have some right to have a say in things, or at least to question the things scientists investigate and how they do it.

      Although i do not think it is appropriate to have some sort of democracy concerning things researchers study, my comment is just meant to perhaps make clear that because scientists often get paid by the taxpayer to do their job, one could question whether they are totally free to do what they want, the way they want to.

        • “Science is an investment, not an entitlement”

          Yes, thank you for this!

          As a student i did my thesis-research and had a $600 budget to spend on it. I designed a study, and used around half of the money because i thought it would be enough for my purpose, and i didn’t want to waste any more than necessary. Note that i did not get to keep the other half of the money or something like that, it was purely reasoning from a “let’s not waste more resources than necessary”-perspective.

          When i read about all the money that gets asked for (and possibly wasted) for simple psychological research i find that highly offensive to the taxpayer. I think a lot of this money is not really for research, but i am not familiar with how things go on that level. I think i did at one time hear something that shocked me along the lines of “we have to spend all our money or otherwise we won’t get the same amount next year”.

          I get the idea that a lot of scientists never think about whether the money they spend on (or ask for) something is worthwhile from a scientific, moral, and ethical perspective.

        • Anon:

          Indeed, from the university’s perspective, the more money you get and spend, the better. I’ve been at promotion reviews where grant-getting was specifically raised as a criterion (as in: “Is this person really a strong enough candidate? I don’t see many grants”). OK, one can rationalize this: grants are peer-reviewed so in that sense it’s as legitimate to count grants as to count papers. Still, it’s troubling for the reasons you say. In some sense, the biggest surprise is that researchers don’t compete even harder for the grant funding that’s out there.

          I will, however, come to the defense of some of the worst psychology research out there, the kind of study where someone does a 10-minute experiment on 20 undergrads or gives out a survey to 100 Mechanical Turk participants. Say what you want about such studies (and I’ll say they’re typically useless), but they are cheap! Some studies, such as the notorious beauty-and-sex-ratio paper, use available data and are essentially free! FMRI experiments, that’s another story.

        • “Indeed, from the university’s perspective, the more money you get and spend, the better.”

          Why is this, i seriously don’t understand.

          The only thing i can come up with, and have only recently read/heard about, is that perhaps universities get a share of the money that is supposed to go to the actual research and the researcher, but i’m not sure about how things go on that level.

          I don’t understand why a university would care about how much grant money an individual researcher attracts, in the same way that i just don’t understand why a university would care about how many papers their faculty publishes.

        • Anon:

          1. Yes, the university gets a cut from each external research grant. Universities such as Columbia rely on their cuts from these grants as part of their overall budget.

          2. Grants are competitive and peer-reviewed, thus getting a grant is a form of outside endorsement of the quality of a researcher’s work. It’s a noisy signal but it’s a signal, just as publications are a signal.

  3. I think the question of what the world would look like if we were buffeted by large effects of minor “treatments” is very interesting. I’m having trouble following the thread of why a given effect would be hard to detect in a buffeted-by-minor-treatments world. Let’s say that we’re running a randomized experiment trying to estimate the effect of treatment A vs B on well measured outcome Y. Is the idea that for many participants in the study treatment A vs B won’t make any difference because their Y was already determined by other factors (such as whether the experimenter wore sneakers or loafers and the color of the room)? i.e. The variance of the estimated effect would be high because the number of people actually influenced by the treatment under study is low? If this is the argument, I think it would be interesting to see how it plays out in some simulations. (But I still don’t see how if you do find an effect of your treatment in a study it can be interpreted as an interaction of other effects…)

    What I find amusing about this is that most people who argue for the plausibility of large effects of minor treatments do so in service of justifying small sample sizes, but in reality their argument might suggest they would need a larger sample to detect their particular effect. (Though simulations/modeling would be necessary to see what the tradeoff is between my effect being large and all others being large on what sample size I probably need.)
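
    One rough way such a simulation might look (all numbers invented): give the treatment a large effect that flips sign with unmeasured background factors that are fixed within any one lab or context.

      import numpy as np

      rng = np.random.default_rng(2)
      n_per_arm, n_labs = 100, 20

      estimates = []
      for _ in range(n_labs):
          # Each lab has its own fixed context (sneakers vs. loafers, room color,
          # time of semester, ...), coded here as five +1/-1 background factors.
          background = rng.choice([-1.0, 1.0], size=5)
          # The treatment effect is large but flips sign with two of those factors.
          effect = 0.5 * background[0] * background[1]
          y_treat = effect + rng.normal(0, 1, n_per_arm)
          y_ctrl = rng.normal(0, 1, n_per_arm)
          estimates.append(y_treat.mean() - y_ctrl.mean())

      estimates = np.array(estimates)
      print("lab-by-lab estimates:", np.round(estimates, 2))
      print("mean:", round(estimates.mean(), 2), "sd across labs:", round(estimates.std(), 2))

    Any single lab can see a large positive or negative “effect” that reverses elsewhere; averaged over contexts the effect is near zero, and no within-lab sample size fixes that.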

    • Z:

      There’s no way all these large effects could be additive, or one would quickly reach the ends of the measurement scales. Nor can they be additive on a transformed scale such as the logit, or again one would quickly reach predicted probabilities that approach 0 or 1. Hence if there are large effects, they’ll need to have large interactions. So the treatment effect of A is meaningless without knowing the values of factors B, C, D, E, etc., in the experiment. But B, C, D, E, etc. will not be controlled in the experiment; they just are what they are. When was the last time you saw a priming experiment that also controlled the pose of the experimenter? When was the last time you saw a power pose experiment that also controlled for priming in the instructions? (And even if these were done, you’d still have to worry about C, D, E, etc.: all those factors such as relationship status, number of older siblings, color of clothing, outdoor air temperature, parents’ socioeconomic status, and so many other variables that have been declared as crucial moderators in various published studies in the psychology literature.)

      • I don’t think this argument works if there are both primes and antiprimes. Imagine an action whose probability of occurrence you want to predict. Assume there are 10^10 possible tiny incremental factors. But the absence of those factors then has an inhibiting effect. On average, half of them are positive and half are negative (although in any particular case some of these unmeasured factors will have a net inhibitory or net causative effect.) Then we have the standard probit model: all of the tiny unmeasured factors go into the error term, and by the central limit theorem they are normally distributed in aggregate impact.

        I don’t believe any of this, but as long as there are inhibitory primes and causative primes, I think it follows.

        • Jonathan:

          If all the effects are tiny, sure, it could work. But if many of the effects are large, then the variance becomes too big, and you’ll get almost everybody having a predicted probability very close to 0 or 1.
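
          A small numerical check of this exchange (arbitrary numbers): add up person-level effects on the logit scale and look at the implied probabilities.

            import numpy as np

            rng = np.random.default_rng(3)
            n_people = 100_000

            def implied_probs(n_factors, effect_size):
                # Each person gets each binary factor at +effect or -effect with
                # probability 1/2; the sum of the signs is 2*Binomial(n, 1/2) - n.
                sign_sum = 2 * rng.binomial(n_factors, 0.5, size=n_people) - n_factors
                return 1 / (1 + np.exp(-effect_size * sign_sum))

            for n_factors, effect_size in [(10_000, 0.01), (20, 1.0)]:
                p = implied_probs(n_factors, effect_size)
                extreme = np.mean((p < 0.05) | (p > 0.95))
                print(f"{n_factors} factors of size {effect_size}: "
                      f"{100 * extreme:.0f}% of people below .05 or above .95")

          Ten thousand tiny balanced effects behave like a well-mannered error term, as Jonathan says; a couple dozen large effects, even perfectly balanced between primes and antiprimes, push roughly half the people to predicted probabilities near 0 or 1.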

      • “There’s no way all these large effects could be additive, or one would quickly reach the ends of the measurement scales.”

        This philosophical point is surprisingly hard for a lot of applied researchers to think through… or at least easy enough for them to dismiss as unnecessary technical nuance (you know, like the meaning of a p-value). I come across it fairly often in the world of quasi-experimental causal inference. In that situation it often goes like: hey, I’ve found that distance from something/football scores/hot days/local dentists named Dennis/snarky internet comments/etc. had an effect of Delta on Y. And of course Delta is often about as big as what 1 year’s worth of income/education/incarceration would produce in Y, because that is how big the estimate has to get to be statistically significant (otherwise you would never be talking about the paper – you’d be talking about the variable that was a 1 in 20 draw).

        So my formulation of the problem has recently taken the form of: “If manipulation X has an effect of Delta, and I’ve been exposed to 74,562 stimuli today that are equivalently manipulative of my psyche as manipulation X, why does manipulation X matter? Like, matter at all?” Maybe I am just a needlessly confrontational jerk (I’ve been described that way before)… but there is something about trying to put the argument into the context of the jumbled, chaotic, over-stimulating mess of a world we inhabit that I think (err… hope?) helps the rhetorical case. Or, again, maybe just a jerk.

  4. This is an interesting thread but a few useful, operative concepts are missing. These include decision-making processes that distinguish between additive and multiplicative models, hard vs soft attributes and the principle of IIA (independence of irrelevant alternatives).

    The distinction between additive and multiplicative approaches in modeling is nontrivial. Assume a set of attributes, features or ‘primes’ descriptive of or contributing to a decision process. When these attributes are added together, attributes which possess zero value will have little or no impact on the decision. On the other hand, multiplicative models stipulate that if the set of attributes multiplied together includes even a single attribute possessing zero value, the entire product, and hence the decision, will be zero.

    This distinction may be best illustrated with a few examples. Imagine you are shopping in a grocery store for fresh food, e.g., salad greens or vegetables. Relevant attributes contributing to your purchase decision could include price, whether or not it’s on promotion that week, preferences for a specific type of salad green (e.g., endive, romaine, arugula), packaging, color and, most importantly, whether it’s fresh or rotten (yes/no). Now, suppose these attributes are all positive but the item is rotten and, therefore, is ‘no’ or zero for that feature. In an additive world, you would still have an overall positive attitude towards purchase of that item. In a multiplicative world, on the other hand, given that a single attribute is zero, that item would never be considered for purchase. Next, consider the Challenger space shuttle disaster. Hundreds, if not thousands, of factors were involved in the decision to launch. If analysts had used a multiplicative model in forming the likelihood for launch risk or success, the fact that a single attribute, such as outside air temperature, was below a threshold of safety could have reset the likelihood of launch success to zero. Next, consider choices related to purchasing tickets to a just-released film. Again, many attributes are involved in this decision including buzz or word-of-mouth, critics’ reviews, genre (comedy, drama, action, animation, etc.), the director and the cast. Suppose the cast includes both Tom Cruise and Owen Wilson and suppose you hate both Tom Cruise and Owen Wilson. With a multitude of features involved, in an additive world their casting in the film would matter little, but in a multiplicative world you would never waste the money. (A toy numerical version of the grocery example appears at the end of this comment.)

    These examples also help illuminate the distinction between hard and soft attributes as well as the IIA. Attributes such as price, product type, state of freshness, outside temperature and film genre are hard, whereas attributes such as word-of-mouth, packaging, color and opinions about actors are soft. In conjoint studies of choice wrt mode of transportation (e.g., walk, bike, drive, bus, train, airplane), mode is a hard attribute but, in an IIA world, the decision-maker would be indifferent to whether the transportation mode is green or blue — the classic green bus-blue bus problem in transportation choice modeling.

    There are entire industries where leverage of the principle of IIA has had huge relevance, for instance, differences in the design of PCs vs Macs. Until recently, PCs were designed and built based on conjoint studies of utility focused on hard attributes and features such as chip speed, RAM, size of hard drive, screen size, weight, and so on. In other words, wrt PCs the principle of IIA was dominant. Macs, on the other hand, were designed with less of an important role for IIA. Mac design considerations included features similar to and competitive with PCs but greater weight was given to softer attributes such as the machine’s color and its graphics capabilities. This made Macs more attractive to artists, creatives and designers.

    In this welter of thousands of possible soft design attributes, a true garden of forking paths, the result has been that Apple (Mac) has total market dominance over Microsoft and IBM. Did statistically driven decision-making focused on hard attributes thwart the success of Microsoft and IBM? In other words, in this example where did the piranhas feast?
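
    A toy numerical version of the grocery example mentioned above (made-up scores on a 0-to-1 scale, with freshness set to zero because the item is rotten):

      # Invented attribute scores; "fresh" is 0 for a rotten item.
      price_ok, on_promotion, preferred_type, nice_packaging, fresh = 0.8, 1.0, 0.9, 0.7, 0.0
      attributes = [price_ok, on_promotion, preferred_type, nice_packaging, fresh]

      additive_score = sum(attributes)        # 3.4 -- the rotten item still scores well
      multiplicative_score = 1.0
      for a in attributes:
          multiplicative_score *= a           # a single zero vetoes the whole product

      print("additive:", additive_score)              # 3.4
      print("multiplicative:", multiplicative_score)  # 0.0

    The additive score still looks attractive; the multiplicative score is vetoed by the single zero.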

    • In an additive world, you would still have an overall positive attitude towards purchase of that item. In a multiplicative world, on the other hand, given that a single attribute is zero, that item would never be considered for purchase.

      Is multiplicative the right term? In a normal decision tree you would just have “is rotten” very near the root. If ‘is rotten == “Yes”‘, then don’t buy, else move on to “is price > x”, etc. I don’t see how that could be called “multiplicative” but perhaps it is in some way.

      Eg, like the decision to settle the lawsuit here:
      https://en.wikipedia.org/wiki/Decision_tree#Decision_tree_using_flowchart_symbols

      • The terminology may be more intuitive if you think of it, not as a decision tree, but as a horizontal array of attributes which are being combined to produce a result — either a sum (additive) or a product (multiplicative).

        • if you think of it, not as a decision tree, but as a horizontal array of attributes which are being combined to produce a result — either a sum (additive) or a product (multiplicative).

          Sure, I guess what I am asking for is some kind of proof/demonstration that these are equivalent (in general or under some conditions). If they are equivalent, which is more expensive to evaluate? I mean, if this exists it is probably some really basic thing I have missed and just do not know the terminology to search for it.

        • Nothing’s “equivalent” here. Additive vs multiplicative are completely different approaches to summarizing information. The distinction between an array and a decision tree boils down to this: an array, however done, summarizes information across a row, a record, an observation or, alternatively within a subject. On the other hand, a tree is partitioning information across subjects, not at the unique row, record, observation or subject level of analysis.

        • Nothing’s “equivalent” here.

          Sorry, I thought by tree and array you were describing two ways to encode the same information.

          The distinction between an array and a decision tree boils down to this: an array, however done, summarizes information across a row, a record, an observation or, alternatively within a subject. On the other hand, a tree is partitioning information across subjects, not at the unique row, record, observation or subject level of analysis.

          I don’t follow why a tree would be partitioning information across subjects. For example (hopefully this shows up ok…):


          [Fresh]
          no / \ yes
          / \
          [leave] [price < threshold]
          no / \ yes
          / \
          leave buy

          Why can I not apply this decision tree to a single piece of fruit? The decision is to either leave the store or buy it.

          Ok, perhaps it trims all leading whitespace… let’s see:

          | [Fresh]
          | no / \ yes
          | / \
          | [leave] [price > threshold]
          | no / \ yes
          | / \
          | leave buy

          Also, it should have been price > threshold. I didn’t mean to bring up whole foods.


  5. ......[Fresh].........................
    ...no./.....\.yes.....................
    ...../.......\........................
    ..[leave]....[price > threshold]......
    ...........no./.........\.yes.........
    ............./...........\............
    .........[leave].......[buy]..........

    What are the rules here?

      • Ok, forget my attempt at drawing a decision tree that way. It is only leading to confusion and formatting issues. Does this make sense to you:
        https://image.ibb.co/cez6n6/decision_Tree.png

        Can you see how it describes the process of buying a vegetable from a store based on whether it is fresh and priced correctly? This is what I mean by a “decision tree” (which is, afaik, the standard meaning).

        When you discussed decision trees you mentioned “partitioning information across subjects”, so I have no idea what you mean by “decision tree”. Can you draw a decision tree to show what you mean?

        • No thanks wrt drawing a tree. It appears that we have completely different frameworks anchoring our perspectives on this discussion.

          If it helps to clarify anything, my unit of analysis is the observation, record or subject, whereas your unit of analysis seems to be at a higher level of analysis or aggregation, e.g., a fruit. Ever heard of a data cube? (e.g., here … https://www.google.com/url?sa=i&rct=j&q=&esrc=s&source=images&cd=&cad=rja&uact=8&ved=0ahUKEwiZiMTRxpHYAhVX02MKHf8DD1QQjRwIBw&url=https%3A%2F%2Fdocs.eazybi.com%2Fdisplay%2FEAZYBIJIRA%2FData%2Bmodel&psig=AOvVaw0dXBSDo-AiuhAmAWuGHoDU&ust=1513617392516791) My unit of analysis corresponds to a single, horizontal slice of that cube. Decision trees partition information vertically for a particular object, e.g., a fruit. What should be apparent to you is that a “decision tree” will not execute with a unit of analysis as granular as a unique observation or record.

          If that doesn’t make sense to you, I give up attempting any further explanations.

        • So the “fruit” (I later switched to vegetable for some reason but ok) has two properties in this case:

          1) freshness/rottenness
          2) whether the price is above or below some threshold

          You would say those are two observations/records?

          If that doesn’t make sense to you, I give up attempting any further explanations.

          I think this frustration is premature. You have so far attempted to explain what you mean *only* in general terms, rather than giving specific examples. I asked for a specific example (eg “draw what you mean by decision tree”) because usually this is very helpful.

          To bring it back to the original point:

          Imagine you are shopping in a grocery store for fresh food, e.g., salad greens or vegetables. Relevant attributes contributing to your purchase decision could include price, whether or not it’s on promotion that week, preferences for a specific type of salad green (e.g., endive, romaine, arugula), packaging, color and, most importantly, whether it’s fresh or rotten (yes/no). Now, suppose these attributes are all positive but the item is rotten and, therefore, is ‘no’ or zero for that feature. In an additive world, you would still have an overall positive attitude towards purchase of that item. In a multiplicative world, on the other hand, given that a single attribute is zero, that item would never be considered for purchase.

          It sounds like your unit of analysis is the specific produce being inspected. This produce has various attributes (price, promotion, type, freshness) each of which will be a dimension of your data cube*. Yet you say “My unit of analysis corresponds to a single, horizontal slice of that cube.” Isn’t your unit of analysis a single cell in the datacube, where other cells correspond to the same attributes of other vegetables?

          Anyway, it seems like there are a lot of issues with terminology here…

          *Having never used a data cube, I don’t grasp the difference between that and an array w/ the same number of dimensions, sorry

        • The terminology used is indeed general, if by ‘general’ you mean ‘widely accepted and understood.’ Any remaining issues with terminology are mostly yours in pretending not to understand the distinction between a sum and a product, in insisting on one and only one way of working with a matrix of data — decision trees — and in failing to do any research related to even basic concepts such as a data cube.

          Nuff said. I’m out of here.

        • pretending not to understand the distinction between a sum and a product

          Wow, I am really sorry you think I’m trolling you by asking for clarification. I can’t see where I failed to understand the difference between sum and product at all???

          I thought maybe you were describing a more efficient way of accomplishing the same thing as a decision tree and would be interested in that (possibly to be used in the context of CARTs). I have no idea where this impression is coming from… Sorry, really.

  6. I too wonder about the ‘nudge’ and the unconscious ease with which it is alleged to work. Like the butterfly effect of myth, the noise buffer is likely to be so great that all we are left with is random effects in either case.

  7. I’m very struck by your juxtaposition of the butterfly effect with what you describe as “the large-predictable-effects framework of social psychology”. Do you have any suggestions for books or papers for those interested in exploring causality in chaotic systems? Is that even a thing?

  8. Very interesting work. But isn’t the piranha effect a trivial consequence of standardization? If you scale by the DV’s total variance (or total “information” or total whatever), increasing one variable’s share necessarily decreases all other variables’ shares, even if the total variance increases.

    I believe the measures end up standardizing by the IV as well, which has strange consequences especially when effects are causal. One can discover a causal relationship in the lab that has 0% R^2 at the time of discovery, and shortly after, a higher R^2 as people adopt the intervention, and eventually a trivial R^2 after nearly everyone has adopted it. A simple example: imagine a perfectly effective LASIK-like eye surgery given at birth to anyone with less than 20/20 vision. The surgery would have a low R^2 at the time of discovery and after widespread adoption, but it would be foolish to call it unimportant.

    But I get the point that when taken as a whole, studies making claims about R^2 can be contradictory because there’s only so much R^2 to go around. But maybe the implication is that we shouldn’t take R^2 too seriously?
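
    A quick sketch of the LASIK point (invented setup: a latent vision score, and a surgery that fully corrects anyone below average who receives it):

      import numpy as np

      rng = np.random.default_rng(4)
      n = 100_000
      baseline = rng.normal(0, 1, n)            # latent vision, higher = better

      def r_squared(adoption_rate):
          needs_fix = baseline < 0
          treated = needs_fix & (rng.random(n) < adoption_rate)
          outcome = np.where(treated, np.abs(baseline), baseline)  # surgery fully corrects
          r = np.corrcoef(treated.astype(float), outcome)[0, 1]
          return r ** 2

      for rate in [0.01, 0.5, 0.99]:
          print(f"adoption {rate:.0%}: R^2 = {r_squared(rate):.3f}")

    The treatment indicator explains almost no variance when adoption is very rare or nearly universal, even though the causal effect is complete for everyone treated, so R^2 is a poor measure of importance either way.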
