
Hey! Participants in survey experiments aren’t paying attention.

Gaurav Sood writes:

Do survey respondents account for the hypothesis that they think people fielding the survey have when they respond? The answer, according to Mummolo and Peterson, is not much.

Their paper also very likely provides the reason why—people don’t pay much attention. Figure 3 provides data on manipulation checks—the proportion guessing the hypothesis being tested correctly. The change in proportion between control and treatment ranges from -.05 to .25, with a bulk of the differences in Qualtrics between 0 and .1. (In one condition, authors even offer an additional 25 cents to give an answer consistent with the hypothesis. And presumably, people need to know the hypothesis before they can answer in line with it.) The faint increase is especially noteworthy given that on average, the proportion of people in the control group who guess the hypothesis correctly—without the guessing correction—is between .25–.35 (see Appendix B).

So, the big thing we may have learned from the data is how little attention survey respondents pay. The numbers obtained here are in a similar vein to those in Appendix D of Jonathan Woon’s paper. The point is humbling and suggests that we need to: (a) invest more in measurement, and (b) have yet larger samples, which is an expensive way to overcome measurement error.

(The two fixes are things you have made before. I claim no credit. I wrote this because I don’t think I fully grasped how much noise there is on online surveys. And I think it is likely useful to explore the consequences carefully.)

P.S. I think my Mturk paper gives one potential explanation for why things are so noisy—not easy to judge quality on surveys:

Here’s one relevant bit of datum from the turk paper: we got our estimates after recruiting workers “with a HIT completion rate of at least 95%.” This latter point also relates to a recent “reputation inflation” online paper.

So, if I’m understanding this correctly, Mummolo and Peterson are saying we don’t have to worry about demand effects in psychology experiments, but Sood is saying that this is just because the participants in the experiments aren’t really paying attention!

I wonder what this implies about my own research. Nothing good, I suppose.

P.S. Sood adds three things:

1. The numbers on compliance in M/P aren’t adjusted for guessing—some people doubtlessly just guessed the right answer. (We can back it out from proportion incorrect after taking out people who mark “don’t know.”)

2. This is how I [Sood] understand things: Experiments tell us the average treatment effect of what we manipulate. And the role of manipulation checks is to shed light on compliance.

If conveying experimenter demand clearly and loudly is a goal, then the experiments included probably failed. If the purpose was to know whether clear but not very loud cues about “demand” matter—and for what it’s worth, I think it is a very reasonable goal; pushing further, in my mind, would have reduced the experiment to a tautology—the paper provides the answer. (But your reading is correct.)

3. The key point that I took from the experiment, Woon, etc. was still just about how little attention people pay on online surveys. And compliance estimates in M/P tell us something about the amount of attention people pay because compliance in their case = reading something—simple and brief—that they quiz you on later.
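Sood's first point, backing out the guessers, can be sketched with the standard correction for guessing. This is only a rough illustration: it assumes that everyone who doesn't know the hypothesis guesses uniformly among a fixed number of answer options, and the number of options and the proportions below are made up, not taken from the paper.

```python
def knowers_share(prop_incorrect: float, n_options: int) -> float:
    """Estimate the share of non-"don't know" respondents who actually knew the answer.

    Assumes (not from the paper) that everyone who does not know guesses
    uniformly among n_options choices, so guessers are wrong with
    probability (n_options - 1) / n_options.
    """
    if n_options < 2:
        raise ValueError("need at least two answer options")
    if not 0.0 <= prop_incorrect <= 1.0:
        raise ValueError("prop_incorrect must be a proportion")
    # Observed incorrect share = guessers * (k - 1) / k, so solve for guessers.
    guessers = prop_incorrect * n_options / (n_options - 1)
    return max(0.0, 1.0 - guessers)


# Hypothetical numbers for illustration only: 4 options, 60% incorrect
# among respondents who did not mark "don't know".
raw_correct = 1.0 - 0.60
adjusted = knowers_share(prop_incorrect=0.60, n_options=4)
print(f"raw correct: {raw_correct:.2f}, adjusted for guessing: {adjusted:.2f}")
```

Under these assumptions the adjusted share is always at or below the raw proportion correct, which is the direction Sood is pointing in.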

Tomorrow’s post: To do: Construct a build-your-own-relevant-statistics-class kit.

When Prediction Markets Fail

A few years ago, David Rothschild and I wrote:

Prediction markets have a strong track record and people trust them. And that actually may be the problem right now. . . . a trader can buy a contract on an outcome, such as the Democratic nominee to win the 2016 presidential election, and it will be worth $1 if the outcome occurs and $0 if the outcome does not occur. The price at which people are willing to buy and sell that contract can be interpreted as the probability of the outcome occurring, or at least the collective subjective probability implicitly assigned by the crowd of people who trade in these markets. . . .

But more recently, prediction markets have developed an odd sort of problem. There seems to be a feedback mechanism now whereby the betting-market odds reify themselves. . . .

Traders are treating market odds as correct probabilities and not updating enough based on outside information. Belief in the correctness of prediction markets causes them to be too stable. . . . pollsters and pundits were also to some extent anchoring themselves off the prediction odds. . . .

And that’s what seems to have happened in the recent Australian election. As Adrian Beaumont wrote:

The poll failure was caused in part by “herding”: polls were artificially too close to each other, afraid to give results that may have seemed like outliers.

While this was a failure for the polls, it was also a failure of the betting markets, which many people believe are more accurate than the polls. . . . the Betfair odds . . . implying that the Coalition had only an 8% chance of winning. . . . It is long past time that the “betting markets know best” wisdom was dumped. . . .

I don’t want to overstate the case here. The prediction markets are fine for what they are. But they’re a summary of what goes into them, nothing more.

P.S. Yes, if all is calibrated, if the stated probability is 8%, then the event will occur 8% of the time. You can’t demonstrate lack of calibration from one prediction. So let me flip it around: why should we assume that the prediction markets are some sort of oracle? Prediction markets are a particular information aggregation tool that can be useful, especially if you don’t take them too seriously. The same goes for any other approach to information aggregation, including those that I’ve promoted.
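To make the calibration point concrete, here is a minimal sketch of what checking calibration actually requires: many forecasts, grouped by stated probability, compared with how often the events happened. The forecaster below is simulated and calibrated by construction, and the function name and bin count are my own choices for illustration.

```python
import random
from collections import defaultdict


def calibration_table(forecasts, outcomes, n_bins=10):
    """Group forecasts into equal-width probability bins and compare the
    mean stated probability with the observed frequency in each bin."""
    bins = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    table = []
    for idx in sorted(bins):
        pairs = bins[idx]
        mean_p = sum(p for p, _ in pairs) / len(pairs)
        freq = sum(y for _, y in pairs) / len(pairs)
        table.append((mean_p, freq, len(pairs)))
    return table


# Simulated, calibrated-by-construction forecaster (made-up data, fixed seed).
rng = random.Random(0)
forecasts = [rng.choice([0.08, 0.5, 0.92]) for _ in range(30_000)]
outcomes = [1 if rng.random() < p else 0 for p in forecasts]

for mean_p, freq, n in calibration_table(forecasts, outcomes):
    print(f"stated {mean_p:.2f}  observed {freq:.3f}  n={n}")
```

With thousands of forecasts per bin, the observed frequencies can be compared with the stated probabilities. With a single 8% forecast there is one row with n=1, and the observed frequency can only be 0 or 1, which says almost nothing either way.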

Australian polls failed. They didn’t do Mister P.

Neil Diamond writes:

Last week there was a federal election in Australia. Contrary to expectations and to opinion polls, the Government (a coalition between the Liberal (actually conservative) and National parties, referred to as LNP or the Coalition) was returned with an increased majority defeating the Australian Labor Party (ALP or Labor, no “u”).

Voting in Australia is a bit different since we have compulsory voting, that is you get fined if you don’t vote, and we have preferential voting. Allocation of preferences is difficult and sometimes based on what happened last election and the pollsters all do it differently.

Attached is a graph of the two party preferred vote over the last three years given by Kevin Bonham, one of the most highly regarded poll analysts in Australia. Note that in Australia Red means Labor and Blue means Liberal. The stars correspond to what actually happened at the election.

Since the election there has been much analysis of what went wrong with the polls. I’m attaching two links—one by a Nobel Laureate, Professor Brian Schmidt of the Australian National University, who pointed out that the published polls had a much lower variability than was expected, and another (very long) post from Kevin Bonham which looks at what has happened and suggests among other things that the polls “may have been oversampling voters who are politically engaged or highly educated (often the same thing).”
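The low-variability point can be sketched with a back-of-the-envelope comparison: under simple random sampling, a poll of n respondents has a standard deviation of about sqrt(p(1-p)/n) on a share p, so independent polls should bounce around by roughly that much. The poll numbers and the sample size below are invented for illustration, and real polls have design effects that this ignores.

```python
import math
import statistics


def sampling_sd(p: float, n: int) -> float:
    """Standard deviation of a poll share from simple random sampling alone."""
    return math.sqrt(p * (1.0 - p) / n)


# Hypothetical Labor two-party-preferred shares (percent) from a run of polls.
polls = [51.0, 51.5, 52.0, 51.0, 51.5, 51.5, 52.0, 51.0]
assumed_n = 1000  # assumed sample size per poll, not taken from the post

observed_sd = statistics.stdev(polls)
expected_sd = 100.0 * sampling_sd(statistics.mean(polls) / 100.0, assumed_n)

print(f"observed SD across polls: {observed_sd:.2f} points")
print(f"expected SD from sampling error alone: {expected_sd:.2f} points")
print("less spread than sampling error predicts:", observed_sd < expected_sd)
```

If the published polls cluster much more tightly than sampling error alone would allow, something other than independent sampling is going on.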

Diamond also links to this news article where Adrian Beaumont writes:

The Electoral Commission’s two party preferred projection is . . . the Coalition wins by 51.5-48.5 . . . Polls throughout the campaign gave Labor between 51 and 52% of the two party preferred vote. The final Newspoll had a Labor lead of 51.5-48.5 [in the opposite direction from what happened; thus the polls were off by 3 percentage points] . . . I [Beaumont] believe the poll failure was caused in part by “herding”: polls were artificially too close to each other, afraid to give results that may have seemed like outliers.

While this was a failure for the polls, it was also a failure of the betting markets, which many people believe are more accurate than the polls. . . . the Betfair odds . . . implying that the Coalition had only an 8% chance of winning. . . . It is long past time that the “betting markets know best” wisdom was dumped. . . .

Another reason for the poll failure may be that pollsters had too many educated people in their samples. Australian pollsters ask for age and gender of those they survey, but not for education levels. Perhaps pollsters would have been more accurate had they attempted to stratify by education to match the ABS Census statistics. People with higher levels of education are probably more likely to respond to surveys than those with lower levels.

Compulsory voting in Australia may actually have contributed to this problem. In voluntary voting systems, the more educated people are also more likely to vote. . . .

If there is not a large difference between the attitudes of those with a high level of education, and those without, pollsters will be fine. . . . If there is a big difference, as occurred with Trump, Brexit, and now it appears the [Australian] federal election, pollsters can miss badly. If you sort the seats by two party swing, those seats that swung to Labor tended to be highly educated seats in the cities, while those that swung biggest to the Coalition were regional electorates. . . .

I’m surprised to hear that Australian polls don’t adjust for education levels. Is that really true? In the U.S., it’s been standard for decades to adjust for education (see for example here). In future, I recommend that Australian pollsters go Carmelo Anthony.
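For readers who haven't seen the adjustment, here is a minimal sketch of plain poststratification by education: estimate support within each education group, then weight the groups by their population shares rather than their shares of the sample. All numbers are made up, and this is only the poststratification step. Mister P (multilevel regression and poststratification) would also fit a multilevel model to get the cell estimates.

```python
def poststratify(cell_means: dict, population_shares: dict) -> float:
    """Weight each cell's estimate by its share of the population."""
    if cell_means.keys() != population_shares.keys():
        raise ValueError("cells and population shares must have the same keys")
    total = sum(population_shares.values())
    return sum(cell_means[c] * population_shares[c] for c in cell_means) / total


# Hypothetical numbers for illustration only.
labor_support = {"degree": 0.56, "no_degree": 0.46}   # support within each group
sample_shares = {"degree": 0.50, "no_degree": 0.50}   # who answered the poll
census_shares = {"degree": 0.30, "no_degree": 0.70}   # assumed population mix

raw = poststratify(labor_support, sample_shares)
adjusted = poststratify(labor_support, census_shares)
print(f"unadjusted: {raw:.3f}, poststratified: {adjusted:.3f}")
```

In this invented example, oversampling degree holders is enough to flip a narrow Labor lead into a narrow deficit once the weights match the population.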

Battle for the headline: Hype and the effect of statistical significance on the ability of journalists to engage in critical thinking

A few people pointed me to this article, “Battle for the thermostat: Gender and the effect of temperature on cognitive performance,” which received some uncritical press coverage here and here. And, of course, on NPR.

“543 students in Berlin, Germany” . . . good enuf to make general statements about men and women, I guess! I wonder if people dress differently in different places . . . .

Padres need Stan

Cody Zupnick writes:

I’m working in baseball research for the San Diego Padres, and we’re looking for new people, potentially with Stan experience. Would you mind seeing if any of your readers have any interest?

Cool!

Epic Pubpeer thread continues

Here. (background here and here)

“I’m sick on account I just ate a TV dinner.”

I recently read “The Shadow in the Garden,” a book by James Atlas that’s a mix of memoir about his experiences as a biographer of poet Delmore Schwartz and novelist Saul Bellow, and various reflections and anecdotes about biography-writing more generally.

I enjoyed the book so much that I’m pretty much just gonna have a post with long quotes from it. This is a labor of love, because (a) I don’t think these sorts of posts get many readers, and (b) it won’t even make Atlas himself happy, as he died a couple years after writing this book.

Before going on, let me say that Atlas reminds me of David Owen or, at a more exalted level, George Orwell: a common person who is a sort of stand-in for the reader. There’s something appealing about this regular-guy thing. (See here for further discussion of this concept.)

OK, now on to the quotes:

p.45:

“The art is in what’s made up.” Well put.

pp.47-48, writing about details in a letter written by Schwartz:

They form not “another piece of the puzzle”—the pieces are infinite and in any case can’t be put together . . .

“And in any case can’t be put together . . .” Indeed.

p.59:

I’m not naive: that I am approaching the end of my time in this world doesn’t mean the world is approaching its end. Forgive me for indulging in this common—no, universal—preconception: how many of us have the fortitude to see things as they really are?

I think all of us have the fortitude to see some things as they really are. It’s just that different people see different things.

p.62:

Sometimes, out on the trail, you could go too far—like the night I got lost in the wilds of rural New Jersey during a snowstorm.

Rural New Jersey! Charmingly local of him.

On p.65 he quotes the poem During December’s Death, “one of the last poems included in [Schwartz’s] collection Summer Knowledge”:

This doesn’t quite motivate me to read more poems by Delmore Schwartz, but I do feel like I got something out of that one.

p.68:

“I’m sick on account I just ate a TV dinner.”

p.77:

But the kicker comes on the next page:

“Would it have killed the biographer to nail down this fact?” I love Atlas’s voice here.

p.106, in a footnote:

Another contemporary biographer of Charlemagne, the memorably named Notker the Stammerer, was almost defiantly insouciant about his editorial methods. “Since the occasion has offered itself, although they have nothing to do with my subject matter, it does not seem to be a bad idea to add these two stories to my official narrative, together with a few more which happened at the same time and are worthy of being recorded.” Note to Notker: Don’t try this at The New Yorker.

The New Yorker does make mistakes—I notice it sometimes when they mangle political statistics—still, that was a funny line.

p.114, discussing a book by biographer Ian Hamilton:

I’ve decided to quote from it at inordinate length: why struggle over some lame paraphrase with a writer as good as Hamilton?

p.138, talking about the great Dwight Macdonald:

“Just read your excellent wrecking job on that academic bronze-ass Bruccoli’s hagiography of O’Hara,” he wrote me . . .

Followed by this delightful footnote:

I’m not sure what Dwight meant by this word, which recurs in his correspondence: its dictionary definition (minus the “ass”) is “spiritual person” or “Buddhist,” but Dwight gave it a perplexingly negative spin. Maybe such types were anathema to his practical mind.

“Minus the ‘ass'” . . . I love that!

Just as an aside, I think John O’Hara is currently underrated, not so much as a writer (but he is a pretty good writer) but as an influence. A few years ago I was disappointed to see a whole article on John Updike written by the estimable Louis Menand that didn’t mention O’Hara even once. And then Patricia Lockwood did it again: an article all about Updike with no O’Hara, despite all their similarities.

p.141, continuing with Macdonald:

“A steady stream of bouillabaisse” . . . sure, in some sense this is writing by the numbers, following up a general description with telling detail. But Atlas does it so well! I’m loving it.

p.145, in a footnote:

As regular readers of this blog will recall, a “Feynman story” is any anecdote that someone tells that is structured so that the teller comes off as a genius and everyone else in the story comes off as an idiot. The above anecdote is an anti-Feynman story: it’s amusingly cringe-worthy and I admire Atlas for sharing it with us.

p.155, writing about Edmund Wilson:


This reminds me of what I wrote about Owen, Orwell—and Atlas!—above. When writing this passage about Wilson, was Atlas thinking about himself too? Maybe so, as he does write, “Wilson was my model.”

p.158:

On the essay’s last page, the thought occurs to Wilson that he might be “stranded,” out of touch with his own life and times.

Maybe this is true of all of us. I’m thinking of a mathematical argument here: Culture is a high-dimensional space, and you can’t be at the center of culture in all dimensions. And, even if you could, then you wouldn’t be the “typical set,” as they say in probability theory. Someone who is central in all dimensions is, in aggregate, extremely unusual. So in that sense maybe it’s no surprise that so many of us—all of us, maybe—feel ourselves to be out of time in some way or another.
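Here is a rough numerical version of that argument, under the loudly stated assumption that each cultural dimension is an independent standard normal (a toy model, nothing more). Being within half a standard deviation of the center on one dimension is common. Being that central on every dimension at once becomes vanishingly rare as the number of dimensions grows, and the typical person sits at a distance of about sqrt(d) from the center.

```python
import math
import random


def prob_central_all_dims(d: int, half_width: float = 0.5) -> float:
    """P(|z_i| < half_width for every i) for d independent standard normals."""
    p_one = math.erf(half_width / math.sqrt(2.0))
    return p_one ** d


def mean_distance_from_center(d: int, draws: int = 2000, seed: int = 0) -> float:
    """Monte Carlo estimate of the average distance from the origin."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        total += math.sqrt(sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(d)))
    return total / draws


for d in (1, 10, 100):
    print(
        f"d={d:3d}  P(central on all dims)={prob_central_all_dims(d):.3g}  "
        f"mean distance={mean_distance_from_center(d):.2f}  sqrt(d)={math.sqrt(d):.2f}"
    )
```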

p.161, Atlas refers to his “one published novel.” That’s very gently put. The use of the word “published” suggests that he had one or more other novels that never saw the light of day. Atlas here is acting as a biographer of himself, providing relevant information to us, the readers, but in a way that is polite and respectful to the subject, which in this case happens to be him. Kinda like when they announce the wedding date and the birth date of the first child, and it’s up to you to figure out that they are less than nine months apart.

p.163, Atlas discussing a Saul Bellow novel:

What good has it done the world? What good has it done him? What does he want? “But that’s just it—not a solitary thing. I am pretty well satisfied to be, to be just as it is willed, and for as long as I may remain in occupancy.”

I’m not the world’s biggest Saul Bellow fan (that would be Martin Amis): to me, Bellow’s writing is beautiful but hard to read, kinda like Melville. (I tried to read Moby Dick once but only got through the first few chapters before giving up from exhaustion.) Nonetheless, I reacted with joy to the above quote because it’s sooooo Bellow-like. You gotta admire someone with such a strong style.

p.166, there’s more:

Back in his apartment, I brought up the matter of the fishmonger’s cluelessness. “People aren’t aware of my presence,” Bellow said with apparent unfeigned equanimity. “What am I compared to the Cubs, the Bears?”

So Bellow-like again. Wonderful!

And on page 170, there’s more:

“There are enough people with their thumbprint on my windpipe.” I don’t know what it is, exactly, but it has the unmistakeable sound of Bellow.

And this:

I’m starting to like this Harris guy. “Stat or Staps or Stat or Stap.” Great rhythm he’s got there. It’s the kind of thing I could imagine a Bellow character saying. I’m thinking that Harris spent so much time living inside Bellow that he started to write like him, or like an imitation of him.

p.171, in a footnote regarding details of biography, Atlas concludes, “Facts matter.” The stubbornness of facts: something Basbøll and I have spent a lot of time worrying over, in the course of shadowboxing with various plagiarists and bullshit artists.

pp.204-205:

Bellow saw four psychiatrists during his lifetime: Dr. Chester Raphael, a Reichian who practiced in Queens and who was the model for Dr. Sapir in his unfinished novel about Rosenfeld; Paul Meehl, a psychologist in Minneapolis he had consulted during the disintegration of his second marriage, when he was teaching at the University of Minnesota; Albert Ellis, the famous “sexologist” whom Bellow saw for what he once described as “pool room work,” or sexual technique; and Heinz Kohut.

Hey, wait a minute! Paul Meehl? Paul Meehl?? The Paul Meehl??? Yup.

And then this:

I [Atlas] had interviewed the first three, all of whom were willing, no doubt out of vanity, to violate patient/doctor (or psychologist) confidentiality.

Dayum. I guess each of us is complicated. Still, sad to hear this about one of my heroes. I would’ve hoped better of Meehl. Or maybe it was ok for him to share whatever stories he had with Atlas. Neither Meehl nor Atlas is around now to discuss it.

On p.207, Atlas shares with us that Meehl is the model for Dr. Edvig in Bellow’s novel Herzog.

I think I’ll have to read Herzog now, just to learn more about Meehl. Did anyone ever write a biography of Meehl? A quick web search doesn’t reveal anything. The closest I can find is an autobiographical essay and a book, “Twelve Years of Correspondence With Paul Meehl: Tough Notes From a Gentle Genius,” by Donald R. Peterson. I don’t think either will give the insight that I’d get from some passages from Herzog. But we’ll see. I’ll report back to you once I’ve read it.

p.211: Bellow’s lawyer is named Walter Pozen. I know a law professor named David Pozen! Walter’s grandson, perhaps? Could be, no?

p.217, a footnote relating to Samuel Johnson’s sobriquet, the Great Cham:

For a long time, I thought this nickname had something to do with “champion,” but it’s actually an Anglicization of khan, someone who rules over a domain—in this case, literature.

I had no idea!

p.223, describing a family trip to Scotland:

We stayed in drafty castles and threadbare bed-and-breakfasts that would have made no Top Hundred Resorts list. . . . we were headed for “a country where no wheel has rolled,” as Johnson put it, the inns were “verminous,” the people “savages,” and the weather “dreary.”

“A country where no wheel has rolled” . . . there’s only one Samuel Johnson!

p.228, on the writing of the Life of Johnson:

Boswell had devised an ingenious method of transcription: having memorized as much as he could of a dialogue, he would scribble down rapid condensed notes, sometimes in Johnson’s presence, abbreviating all but key words—“the heads,” he called them, the ingredients of “portable soup,” “a kind of stock cube from which I could make up a broth, when the time came to feed.” It didn’t always congeal. “I have the substance,” he confided in his journal, “but the felicity of expression, the flavor, is not fully preserved unless taken instantly.” . . .

I know that feeling. I’m bad with exact quotes. If I don’t write it down word for word when I hear it, I can never reconstruct it just right. It’s so frustrating. I don’t think I could ever be a playwright. My dialogue generator just doesn’t work so well. George V. Higgins I ain’t.

Hey—I caught a mistake! On p.235, describing the apartment of sociologist Edward Shils: “On the top shelf was a long row of the Journal of American Sociology.” No! He’s thinking of the American Journal of Sociology. Or maybe the American Sociological Review. Funny how that just stuck out like a sore thumb in my reading.

p.239, reporting a conversation with Bellow:

He [Bellow] was having fun. A famous writer who had never got over the “Trotsky worship” of the 1930s he dismissed as “a grade-school radical.” A well-known Oxford academic was “a twit.” Of a literary critic who had made a career out of the Transcendentalists: “He thinks mystique is a perfume.” I [Atlas] marveled at this unguardedness, at once so calculated and so naive. Bellow never said, “Don’t quote me” or “This is off the record.”

So, OK, then who was the “grade-school radical”? Who were the Oxford academic and the literary critic? I want to know this (extremely low-level) gossip. Or maybe Atlas is making a point by not saying who he’s talking about?

p.242, I learn that Bellow was nearly jailed for perjury, for lying about his income in a divorce proceeding. Wow! I guess that five marriages pretty much sucked up all his ready cash.

p.249, getting to some of Bellow’s political leanings:

He launched into a tirade about ‘affirmative suction’—he had a weakness for bad puns . . .

I admire Atlas for giving an actual bad pun here. So often when we hear that someone likes bad puns, we get examples that are actually funny—“groaners,” but funny in their own way. But “affirmative suction”: that’s not funny, even in a so-bad-it’s-good sense. It’s just kinda crude and stupid. No big deal—all of us say stupid things from time to time, and if a biographer followed me around all day, I’m sure I’d give him plenty of raw material to make me look bad, if he so chose—still, it’s a telling detail and I appreciate that Atlas included it, rather than just portraying Bellow as a lovable curmudgeon.

p.254: “By the time I left, I was way over my limit of Bellow exposure—the amount of time I could spend around him before I got Bellow burnout. So much concentration, combined with the suppression of self, was exhausting.” I can believe that.

Also on p.254, Atlas gives this charming slice-of-life of the biographer:

Late one night in the autumn of 1993, I flew into O’Hare and got my car from the Avis lot. I loved this part of the job [emphasis added]: tossing my suitcase into the back, hanging up my jacket on the plastic hook, and driving off in a bright-colored Chevrolet Impala, fiddling with the dial until I found WFMT, “Chicago’s classical music” station, 98.7 on the dial.

Again, he’s Everyman. David Owen, not David Foster Wallace. George Orwell, not George Gershwin. Edmund Wilson, not Vladimir Nabokov. And I’m happy to be in his company.

p.265:

There is no such thing as Biography School, but if there were, Shils could have been its dean. Among the lessons he taught me: you had to place your subject in a historical context . . . you had to make people sound authentic . . . you had to listen to what people said and be skeptical about pronouncements that sounded smart but on closer scrutiny meant nothing . . .

Above all, you had to get your facts straight, however trivial they seemed (“There is no streetcar on 51st Street”), because if you got a fact wrong, even if no one noticed, it would set off a vibration of wrongness that made everything around it, all the facts and quotes and speculations, feel somehow off.

Are you listening, Marc Hauser? Brian Wansink? Susan Fiske?

Probably not. James Atlas never gave a Ted talk.

p.283, in a footnote:

“Narrative truth can be defined as the criterion we use to decide when a certain experience has been captured to our satisfaction; it depends on continuity and closure and the extent to which the fit of the pieces takes on an aesthetic finality.” Narrative Truth and Historical Truth: Meaning and Interpretation in Psychoanalysis, by Donald P. Spence, a book every biographer should read.

Interesting. I don’t like the use of the word “truth” to mean “coherence,” but on the other hand, ultimately only truth is coherent—as Mark Twain famously put it, if you tell the truth you don’t have to remember anything—so maybe this is ok after all.

p.287, footnoting a description of Nabokov as a “control freak”:

I [Atlas] have circled around this phrase, deleting and restoring it several times. It feels somehow too idiomatic, and therefore inappropriate, even faintly insulting to a master of usage like Nabokov. But isn’t the goal in writing to approximate ordinary speech? And Nabokov was a control freak. Stet.

I just love so much that Atlas cared about getting this just right. I feel the same way about each of my paragraphs—including those in my blogs.

p.295, Atlas reveals himself—briefly:

It was an older crowd, verging on the geriatric, but there were lots of younger people, too, in their thirties and forties. Bellow was read now by a new generation; he still had the goods.

You gotta be kind of old yourself to think of people in their thirties and forties as the new generation. I mean, sure, literally they are the ages of the children or grandchildren of Bellow’s first readers. But still.

p.296, Atlas talks about Bellow, Updike, and Roth. In addition to being dead white males, all three of these authors are striking to me as being perpetual children, never parents. Sure, Updike had 4 kids and Bellow had 3. But in their writings, even when they’re older, they still seem to approach the world as curious or sensitive or petulant children. They never seem to have the parental view of the world. This seems so sad to me. Having kids, if you have them, is such a central part of life. To have children but not let this affect you . . . it’s just too bad (also discussed here).

p.297, Atlas calls the Bellow home and reaches Saul’s young wife:

It was Janis who answered: “This is Mrs. Bellow.” She was friendly when I announced myself. “Hello, Jim Atlas,” she said pertly . . .

A vivid description in just a few words. Well done, Jim Atlas.

p.301, as the biography-writing continues and gets more challenging:

Bellow’s portrait was beginning to darken, like a negative exposed to light. Even his friends had unkind things to say. . . . I [Atlas] had a disagreeable interview with Mel Tumin . . . Tumin was hostile; he dwelled on his recent gallbladder operation, disparaged biography (“There’s no such thing as truth”), and assured me that Bellow’s girlfriends at the University of Chicago “weren’t pretty.” . . .

That last bit’s kinda funny, someone getting back at an old pal by disparaging the looks of his college girlfriends. But I guess the real message here is not to do any interviews right after major surgery.

p.313:

“When is that book of yours coming out?” Bellow wrote me a few weeks before publication day. “I feel as if I should go off to Yemen.”

Yemen! Again, that just sounds sooooo Bellow. Amazing, that voice.

p.318: Atlas refers to someone informing him “with the kind of tedious precision that often attends recounting of wrongs . . .”

Indeed! I’ve done that sometimes, and I’m sure it’s tedious to others. I haven’t had a lot of wrongs done to me in my long life, but the ones that have been, I’ll recount with tedious precision, that’s for sure.

p.326:

I see Maggie Simmons, and we embrace. Maggie maintained a close relationship with Bellow for half a century and was, according to many, the love of his life.

The love of Bellow’s life! Here we are on page 326, the book is almost over, and this is the first time we hear about her? Or maybe Atlas is doing this on purpose, to keep adding twists to the story all the way to the very end? In any case, I feel a bit manipulated to have only heard about this person right now, so late in the book.

p.333: Atlas quotes critic James Wood as saying of Bellow being an inattentive parent:

“How, really, could the drama of paternity have competed with the drama of creativity?” asked Wood. For Bellow, the writing was the living.

I agree with Atlas that this is ridiculous. For one thing, it’s not like you need to be a creative artist to be a bad parent. Lots of people are bad parents without creating anything at all. Parenting takes work, that’s all.

p.347:

History is ever regenerative. New subjects arise as the old ones disappear—including people we never heard of. Virginia Woolf asked: “Is not anyone who has lived a life, and left a record of that life, worthy of biography—the failures as well as the successes, the humble as well as the illustrious?” What about all the people I’ve known who didn’t leave records of their own lives? Don’t they deserve biographies, too? Sing now of Scottie A., my best friend when I was growing up in Highland Park, Illinois, who built snow forts with me in the days when there was snow, and who died of cancer at the age of fifty-eight, which maybe wasn’t such a terrible thing as he was about to be put on trial for securities fraud. . . .

That’s a cheap laugh, but a laugh nonetheless. OK, sad too. Anyway, Atlas makes this point well with that fine one-sentence mini-biography of the unfortunate Scottie A.

And, in a footnote on p.350:

It reminds me [Atlas] of the passage in Lord Jim where the young sailor on the deck of a ship bound for the East watches “the big ships departing, the broad-beamed ferries constantly on the move, the little boats floating far below his feet, with the hazy splendor of the sea in the distance, and the hope of a stirring life in the world of adventure.”

I’ve never read Lord Jim! I guess I should.

And now we’ve come to the end.

I’m so glad Atlas put in the effort to write this book. And it all makes me so nostalgic. I think I’m gonna read Catcher in the Rye again. And I really wish Atlas were still alive to read this.

Finally, if you’ll allow me a Geoff Dyer moment, I’ll say that I find Atlas’s book about his biography more compelling than Bellow’s novels—and also more compelling than I imagine Atlas’s biography of Bellow to be. But at this point I’m curious enough that I expect I will read that biography. I doubt I’ll get around to reading about Delmore Schwartz, though: his story just sounds too sad.

P.S. Jeez—I spent 2 hours writing this. Whoever of you reads this to the end . . . just remember, I wrote it for you.

Have prices risen more quickly for people at the bottom of the income distribution than for those at the top? Lefty window-breakers wait impatiently while economists struggle to resolve this dispute.

Palko points us to this post by Mike Konczal pointing to this news article by Annie Lowrey reporting on research by Christopher Wimer, Sophie Collyer, and Xavier Jaravel finding that “prices have risen more quickly for people at the bottom of the income distribution than for those at the top.”

This new result counters an earlier study that got a bit of attention back in 2008, which I’ll get back to in a bit.

Before getting to the main topic of this post, which has nothing really to do with income inequality, let me talk about all the statistical and political challenges here.

First the political challenges. All the people mentioned above are coming from the left or center-left in the U.S. context, generally supporting economic redistribution, government regulation of the economy, and taking the side of labor in disputes with business. All these positions are relative, of course—I don’t think there are any Soviet-style communists in the room—but they have a general motivation to report that things are relatively worse for the poor. This would generally be the case, and it’s even more so during a Republican administration.

If a Democrat is president, the political motivations are more mixed: on one hand, people on the left will still want to emphasize the difficulties of being poor, but at the same time, people on the right might want to talk about rising inequality as a way to discredit the Democrats.

I don’t want to overplay this political point here. People on all sides of this discussion may well be addressing the data as honestly as they can, but we should still recognize their political incentives.

The second political challenge is that I know two of the authors of this paper. I work with Chris Wimer and Sophie Collyer, and we’ve spent a lot of time talking about issues of measurement and poverty within households. I haven’t been involved in the particular work being discussed here—my contributions are with a survey of New York City families, and this appears to be a national study—but in any case I have this professional connection that you should be aware of.

The statistical challenge is that definitions of poverty depend, in large part, on survey responses and survey adjustments. I have no quick answers here, and I’ve not read this new study in detail. I just know that in economics, the data are not simply sitting there; they need to be constructed. And this can drive some of the differences in conclusions.

And there’s more. Go back to Konczal’s above-linked post and you’ll see a link to a post from Will Wilkinson in 2008 pointing to a Freakonomics blog post by Steven Levitt from that year. The link to Levitt’s post no longer seems to work, and the new link is missing the comment section, so I’ll point you to the Internet Archive version, where the rogue economist writes:

Inequality is growing in the United States. The data say so. Knowledgeable experts like Ben Bernanke say so. Ask just about any economist and they will agree. . . . According to two of my University of Chicago colleagues, Christian Broda and John Romalis, everyone is wrong.

Inequality has not grown over the last decade — at least not very much. What we think is a rise in inequality is merely an artifact of how we measure things.

As improbable as it may seem, I believe them.

Their argument could hardly be simpler. . . .

Let’s dissect this. The statement has to be an “improbable” surprise—“just about any economist” thinks the opposite—but yet it “could hardly be simpler.”

Levitt continues into a digression regarding “lefties” and “the sorts of people who break store windows in Davos,” which doesn’t seem to be so relevant, given that earlier he’d said that this new study was contradicting everyone, from Davos window-breakers to “just about any economist.”

Konczal also links to this 2008 post by sociologist Lane Kenworthy, who summarizes the argument of Broda and Romalis:

Income inequality has increased over time. But analysis of consumption data indicates that people with low incomes are more likely than those with high incomes to buy inexpensive, low-quality goods. In part because those goods increasingly are produced in China, their prices rose less between 1994 and 2005 than did the prices of goods the rich tend to consume. Hence the standard measure of inequality, which is based on income rather than consumption, greatly overstates the degree to which inequality increased. The incomes of the rich rose more than those of the poor, but because the cost of living increased more for the rich than for the poor, things more or less evened out.

The discussion then turns on the question of whether rich people are getting anything for these expensive purchases. Or, to put it another way, whether poor people are suffering for not being able to afford nice things. Kenworthy argues no.

Then again, I’ve collaborated with Kenworthy so maybe I’m more likely to hear out his arguments.

So, to summarize:

– In 2008 there was agreement, or tentative agreement, regarding the claim that the prices of things that poor people bought were going up more slowly than the prices of things that rich people bought. But there was disagreement about whether this should be taken to imply that consumption inequality was decreasing.

– As of 2019, it seems that the prices of things that poor people bought have been going up faster than the prices of things that rich people bought.

Is this a contradiction? I’m not sure. The time periods of the two studies differ: the 2008 study covers the 1994-2005 period, and the 2019 study covers the 2004-2018 period. So it’s possible that the poor people’s products had a relative decline in price for one decade, followed by a relative increase during the next. Also, the two studies are using different methods. It would be good if someone could apply the methods of the first study to the data of the second study, and vice-versa.

Putting this all together, you can see that the statistics and economics questions connect only tangentially to the political questions. Levitt was sharing an empirical claim, but it only took him a few paragraphs to start ranting about window-breaking leftists. Kenworthy accepted the empirical claim but refused to draw the same political conclusion. Now the empirical claim goes the other way, so the arguments about relevance can be spun in the opposite direction.

In saying all this, I’m not trying to imply that the economic questions are unimportant. I think it’s worth trying to measure these things carefully, even while interpretations can differ.

In this case, it’s a lot less effort for me to write a thousand words about the dispute, than to carefully read the two research articles and try to figure out exactly what’s going on. I skimmed through the Broda and Romalis article but then I got to Figure 4A which scared the hell out of me!

P.S. More here from Elena Botella.

Columbia statistics department is hiring!

Official announcement is below.

Please please apply to these faculty and postdoc positions. We really need some people who do serious applied work, especially in social sciences. Obv these will be competitive, but please give it a shot, because we’d like to have some strong applied candidates in the mix for all of these positions. Thanks!

The Department of Statistics at Columbia University is looking to fill multiple faculty positions. Please see here for full listings.

Tenure-Track Assistant Professor (review of applications begins on November 29, 2019) This is a tenure-track Assistant Professor position to begin July 1, 2020. A Ph.D. in statistics or a related field is required. Candidates will be expected to sustain an active research and publication agenda and to teach in the departmental undergraduate and graduate programs. The field of research is open to any area of statistics and probability.

Assistant Professor (limited-term) (*multiple openings*; review of applications begins on December 2, 2019) These are four-year term positions at the rank of Assistant Professor to begin July 1, 2020. A Ph.D. in statistics or a related field is required, as is a commitment to high-quality research and teaching in statistics and/or probability. Candidates will be expected to sustain an active research and publication agenda and to teach in the departmental undergraduate and graduate programs. Candidates with expertise in machine learning, big data, mathematical finance, and probability theory are particularly encouraged to apply.

Lecturer in Discipline (review of applications begins on January 6, 2020) This is a full-time faculty appointment with multi-year renewals contingent on successful reviews. This position is to contribute to the Departmental educational mission at the undergraduate and masters level.

The department currently consists of 35 faculty members and 59 Ph.D. students. The department has been expanding rapidly and, like the University itself, is an extraordinarily vibrant academic community. We are especially interested in candidates who, through their research, teaching and/or service, will contribute to the diversity and excellence of the academic community.

In addition to the above faculty positions, the department is also considering applications to our Distinguished Postdoctoral Fellowships in Statistics. Review of applications begins on January 13, 2020. See stat.columbia.edu/faculty-positions for details.

Women and minorities are especially encouraged to apply. For further information about the department and our activities, centers, research areas, and curricular programs, please go to our web page at http://www.stat.columbia.edu

P.S. Thanks to Zad Chow for the above photo, which demonstrates how relaxed you’ll be as one of our colleagues here at Columbia.

The incentives are all wrong (causal inference edition)

I was talking with some people the other day about bad regression discontinuity analyses (see this paper for some statistical background on the problems with these inferences), examples where the fitted model just makes no sense.

The people talking with me asked the question: OK, we agree that the published analysis was no good. What would I have done instead? My response was that I’d consider the problem as a natural experiment: a certain policy was done in some cities and not others, so compare the outcome (in this case, life expectancy) in exposed and unexposed cities, and then adjust for differences between the two groups. The discontinuity is a challenge here (the policy was implemented north of the river but not south), but this sort of thing arises in many natural experiments. You have to model things in some way, make some assumps, no way around it. From this perspective, though, the key is that this “forcing variable” is just one of the many ways in which the exposed and unexposed cities can differ.
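To make that plan concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the cities, the variable names (latitude, income), the linear model, and the size of the effect are all made up, and none of it comes from the published analysis. It compares exposed and unexposed cities with and without adjusting for the ways the two groups differ, treating the forcing variable as just one more predictor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated (hypothetical) data: 200 cities, exposure set by position relative to a river.
n = 200
latitude = rng.uniform(-1.0, 1.0, n)                # distance north (+) or south (-) of the river
exposed = (latitude > 0).astype(float)              # policy applied north of the river only
income = rng.normal(0.0, 1.0, n) + 0.5 * latitude   # another way the two groups differ

true_effect = -1.0  # assumed effect of the policy on life expectancy, in years
life_exp = (75 + true_effect * exposed + 2.0 * income
            + 1.5 * latitude + rng.normal(0.0, 1.0, n))

def ols(X, y):
    """Least-squares coefficients for y ~ X (X already includes an intercept column)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

ones = np.ones(n)
# Unadjusted comparison: the coefficient on `exposed` is the raw difference in means.
raw = ols(np.column_stack([ones, exposed]), life_exp)
# Adjusted comparison: the forcing variable enters as one predictor among others.
adjusted = ols(np.column_stack([ones, exposed, latitude, income]), life_exp)

print(f"raw difference in means: {raw[1]:.2f}")
print(f"adjusted for covariates: {adjusted[1]:.2f}")
```

The adjusted estimate is only as good as the model for how the outcome depends on latitude and the other predictors. Here the simulation is linear by construction, so the linear adjustment works; with real cities that is an assumption to be checked, not a given.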

After I described this possible plan of analysis, the people talking with me agreed that it was reasonable, but they argued that such an analysis could never have been published in a top journal. They argued that the apparently clean causal identification of the regression discontinuity analysis made the result publishable in a way that a straightforward observational study would not be.

Maybe they’re right.

If so, that’s really frustrating. We’ve talked a lot about researchers’ incentives to find statistical significance, to hype their claims and not back down from error, etc., as well as flat-out ignorance, as in the above example, researchers naively thinking that some statistical trick can solve their data problems. But this latest thing is worse: the idea that a better analysis would have a lower chance of being published in a top journal, for the very reasons that make it better. Talk about counterfactuals and perverse incentives. How horrible.

Filling/emptying the half empty/full glass of profitable science: Different views on retiring versus retaining thresholds for statistical significance.

Unless you are new to this blog, you likely will know what this is about.

Now, by profitable science in the title is meant repeatedly producing logically good explanations which “through subjection to the test of experiment, to lead to the avoidance of all surprise and to the establishment of a habit of positive expectation that shall not be disappointed.” (CS Peirce)

It all started with a Nature commentary by Valentin Amrhein, Sander Greenland, and Blake McShane. Then the discussion, then thinking about it, then an argument that it is sensible and practical, then an example of statistical significance not working, and then a dissenting opinion by Deborah Mayo.

Notice the lack of finally!

However, Valentin Amrhein, Sander Greenland, and Blake McShane have responded with a focused and concise account of why they think retiring statistical significance will fill up the glass of profitable science while maintaining hard default thresholds for declaring statistical significance will continue to empty it: Statistical significance gives bias a free pass. This is their just-published letter to the editor (JPA Ioannidis) on TA Hardwicke and JPA Ioannidis’ Petitions in scientific argumentation: Dissecting the request to retire statistical significance, where Hardwicke and Ioannidis argued (almost) the exact opposite.

“In contrast to Ioannidis, we and others hold that it is using – not retiring – statistical significance as a ‘filtering process’ or ‘gatekeeper’ that ‘gives bias a free pass’.”

A two sentence excerpt that I liked the most was “Instead, it [retiring statistical significance] encourages honest description of all results and humility about conclusions, thereby reducing selection and publication biases. The aim of single studies should be to report uncensored information that can later be used to make more general conclusions based on cumulative evidence from multiple studies.”

However, the full letter to the editor, Statistical significance gives bias a free pass, is only slightly longer than two pages, so it should be read in full.

I also can’t help but wonder how much of the discussion that ensued from the initial Nature commentary could have been avoided if less strict page limitations had been allowed.

Now it may seem strange that an editor who is also an author on the paper drawing a critical letter to the editor would accept it. It happens, but not always. I also submitted a letter to the editor on this same paper and the same editor rejected it without giving a specific reason. That full letter of mine is below for those who might be interested.

My letter was less focused but had three main points: someone with a strong position on a topic who undertakes a survey themselves displaces the opportunity for others without such strong positions to learn more; univariate summaries of responses can be misleading; and (minor) pre-registration violations and comments (only given in the appendix) can provide insight into the quality of the design and execution of the survey. For instance, the authors had anticipated analyzing nominal responses with correlation analysis.

“The paper has been blind peer-reviewed and published in a highly reputable journal, which is the gold standard in scientific corroboration. Thus, all protocol was followed to the letter and the work is officially supported.”

Robert MacDonald points us to this news article by Esther Addley:

It’s another example of what’s probably bad science being published in a major journal, where other researchers point out its major flaws and the author doubles down.

In this case, the University of Bristol has an interesting reaction. It’s pulled down its article praising the research, which is good, but it’s also distancing itself from him. Whereas it was originally very happy to associate itself with this work, now they’re saying it was done independently and has nothing to do with Bristol. I’m actually pretty disappointed in that, partly because they can’t have it both ways but also because it seems (to me) like it weakens the university-faculty relationship.

The author’s response (dripping with arrogance) is a concise summary of the sort of “published research is unquestionable” mentality you’ve been talking about. As quoted in the article:

The paper has been blind peer-reviewed and published in a highly reputable journal, which is the gold standard in scientific corroboration. Thus, all protocol was followed to the letter and the work is officially supported. Given time, many scholars will have used the solution for their own research of the manuscript and published their own papers, so the small tide of resistance will wane.

I find it particularly interesting that he’s arguing not that others will see he’s right, but that other people will start using his results — so I guess the resistance will dry up because his results will become embedded in the fabric of the whole field.

Yup. The research incumbency rule. Just horrible.

How to teach sensible elementary statistics to lower-division undergraduates?

Kevin Carlson writes:

Though my graduate education is in mathematics, I teach elementary statistics to lower-division undergraduates.

The traditional elementary statistics curriculum culminates in confidence intervals and hypothesis tests. Most students can learn to perform these tests, but few understand them. It seems to me that there’s a great opportunity to reform the elementary curriculum along Bayesian lines, but I also see no texts that attempt to bring Bayesian techniques below the prerequisite level of calculus and linear algebra. Do you think it’s currently possible to teach elementary stats in a Bayesian way? If not now, what might need to happen before this became possible?

My reply:

I do think there’s a better way to teach introductory statistics but I’m not quite there yet. I think we’d want to do it using simulation, but inference is a sticking point.

To start with, let’s consider three levels of intro stat:

1. The most basic, “stats for poets” class that provides an overview but few skills and no derivations. Currently this seems to usually be taught as a baby version of a theoretical statistics class, and that doesn’t make sense. Instead I’m thinking of a course where each week is a different application area (economics, psychology, political science, medicine, sports, etc.) and then the concepts get introduced in the context of applications. Methods would focus on graphics and simulation.

2. The statistics course that would be taken by students in social science or biology. Details would depend on the subject area, but key methods would be comparisons/regression/anova, simple design of experiments and bias adjustment, and, again, simulation and graphics. The challenge here is that we’d want some inference (estimates and standard errors, and, at the theoretical level, discussions of bias and variance) but this all relies on concepts such as expectation, variance, and some version of Bayesian inference, and all of these can only be taught at a shallow level.

3. A statistics class with mathematical derivations. For this you should be able to teach the material any way you want, but in practice these classes have a pretty shallow mathematical level and give pseudo-proofs of the key results. I don’t think there’s any way to teach statistics rigorously in one semester from scratch. You really need that one semester on probability theory first.

Option #2 above is closest to what I teach, and it’s what Jennifer and Aki and I do in our forthcoming book, Regression and Other Stories. We do lots of computing, and we keep the math to a minimum. Bayes is presented as a way of propagating error in predictions, and a way to include prior information in an analysis. We don’t do any integrals.

I’m not yet sure how to do the intro stat course. Regression and Other Stories starts from scratch, but the students who take that class have already taken introductory statistics somewhere else.

For that first course, I think we need to teach the methods and the concepts, without pretending to have the derivations. Students who want the derivations can go back and learn probability theory and theoretical statistics.
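As one small illustration of teaching a concept by simulation rather than derivation, here is a sketch of how a standard error could be shown without any calculus. The population proportion and sample size are made-up values, and this is just one possible classroom exercise, not material from the book.

```python
import numpy as np

rng = np.random.default_rng(1)

p_true = 0.6      # assumed population proportion (hypothetical)
n = 100           # respondents in each simulated survey
n_sims = 10_000   # number of simulated surveys

# Each simulated survey yields one estimate: the share of "yes" answers.
estimates = rng.binomial(n, p_true, size=n_sims) / n

# The standard error is the spread of the estimate across repeated surveys.
sim_se = estimates.std()
# The textbook formula, stated without proof, for comparison.
formula_se = np.sqrt(p_true * (1 - p_true) / n)

print(f"simulated standard error: {sim_se:.3f}")
print(f"formula sqrt(p(1-p)/n):   {formula_se:.3f}")
```

Students can change n and see how the spread of the estimates shrinks, which gets at the concept without pretending to have the derivation.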

Hey, Stan power users! PlayStation is Hiring.

Imad writes:

The Customer Lifecycle Management team at PlayStation is looking to hire a Senior Data Modeler (i.e. Data Scientist). DM me if you like building behavioral models and working with terabytes of data. You’ll have the opportunity to use whatever tools you want (e.g. Stan) to build your models.

I’m not into videogames myself, but for the right person I’m guessing this job would be a lot of fun.

The dropout rate in his survey is over 60%. What should he do? I suggest MRP.

Alon Honig writes:

I work for a cpg company that conducts longitudinal surveys for analysis of customer behavior. In particular they wanted to know how people are interacting with our product. Unfortunately the designers of these surveys put so many questions (100+) that the dropout rate (those that did not complete the survey) was over 60%. The researchers of the data (all had an academic background) told me that this drop rate was in fact quite normal for such studies. In the past when I did marketing analysis we would start getting concerned about a dataset when the dropout rate was above 20%. That is because we knew there was something strange about the remaining population, making inference on the general population faulty. The research team acknowledged the issue but didn’t seem to be concerned about the bias of their findings.

I wanted to know how we should think about the dropout rate after conducting a survey. Does this mean we should create a new one? Or should we adjust our results to account for this? What is a reasonable rate anyways?

My quick answer is to do multilevel regression and poststratification to adjust for known differences between sample and population. Use multilevel regression to model each outcome of interest, conditional on whatever variables you think are predictive of dropout and the outcome. In your regression, include interactions of these predictors with anything you care about in your modeling. Then use poststratification to take the predictions from your model and average them over your population.

And, yes, you should still try your best to minimize dropout, and to identify what factors determine dropout so that you can try to measure them and include them in your model.

P.S. Just to clarify: I’m not saying that MRP automatically solves this problem. What I’m saying is that MRP is a framework that can allow us to attack the problem.
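Here is a toy sketch of the two steps, under loud assumptions. The four age groups, their population shares, outcome rates, and completion rates are all invented, and the “multilevel regression” step is replaced by a crude partial-pooling formula for group means so that the example runs with NumPy alone. A real analysis would fit an actual multilevel model (in Stan, for example) with many more predictors and interactions.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical setup: four age groups with known population shares.
pop_share = np.array([0.30, 0.30, 0.25, 0.15])
true_mean = np.array([0.20, 0.35, 0.50, 0.65])    # assumed outcome rate in each group
completion = np.array([0.15, 0.30, 0.50, 0.70])   # assumed completion (1 - dropout) rates

# Simulate who is invited and who finishes the survey.
n_invited = 4000
group = rng.choice(4, size=n_invited, p=pop_share)
finished = rng.random(n_invited) < completion[group]
g = group[finished]
y = (rng.random(g.size) < true_mean[g]).astype(float)

raw_estimate = y.mean()  # ignores that completers over-represent some groups

# Step 1 (crude stand-in for multilevel regression): partially pooled group means.
n_j = np.bincount(g, minlength=4)
ybar_j = np.bincount(g, weights=y, minlength=4) / n_j
within_var = raw_estimate * (1 - raw_estimate)    # rough variance of a binary outcome
between_var = max(ybar_j.var() - np.mean(within_var / n_j), 1e-6)
weight = n_j / (n_j + within_var / between_var)   # small groups are pulled toward the overall mean
pooled_j = weight * ybar_j + (1 - weight) * raw_estimate

# Step 2: poststratify by averaging the group estimates over the population shares.
mrp_estimate = np.sum(pop_share * pooled_j)
truth = np.sum(pop_share * true_mean)

print(f"truth {truth:.3f}, raw {raw_estimate:.3f}, poststratified {mrp_estimate:.3f}")
```

The adjustment helps here only because dropout depends on a variable that is both measured in the survey and known in the population. If dropout depends on something unmeasured, no reweighting of these groups will fix it, which is why it is still worth identifying what drives dropout.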

The climate economics echo chamber: Gremlins and the people (including a Nobel prize winner) who support them

Jay Coggins, a professor of applied economics at the University of Minnesota, writes in with some thoughts about serious problems within the field of environmental economics:

Your latest on Tol [a discussion of a really bad paper he published in The Review of Environmental Economics and Policy, “the official journal of the Association of Environmental and Resource Economists and the European Association of Environmental and Resource Economists.”] got me [Coggins] thinking. People might suppose the gremlins hubbub dinged his reputation, but no. Tol is still considered an elite climate economist. In 2016, when a lineup of heavy climate-econ hitters wrote a Policy Forum piece for Science, Tol was included as a co-author. That paper is safely post-gremlins; his reputation remains intact.

Tol’s damage-related papers get published partly because the academic climate-econ crowd is a bit of an echo chamber. My conjecture is they review each others’ papers and they tell editors to print them. What else is an editor to do? The problem is not just that Richard Tol can write authentic academese.

Why do I think it’s an echo chamber? Because Tol said so himself, in a 2013 JED&C paper from his continuing series. Here he’s explaining why the uncertainty around his results might be larger than it appears: “[T]he researchers who published impact estimates are from a small and close-knit community who may be subject to group-thinking, peer pressure and self-censoring.”

Also, the method Tol introduced in the 2009 JEP paper remains a key part of DICE, the integrated assessment model (IAM) for which William Nordhaus won the 2018 economics Nobel. I’m not sure the connection between DICE and Tol’s paper is much appreciated.

In the 2013 version of DICE, Nordhaus based his monetary climate-damage function on Tol’s 2009 JEP results. That function connects any amount of warming to the resulting loss in GDP. Here are Nordhaus and Sztorc, p. 11 of the 2013 DICE user’s manual: “DICE-2013R uses estimates of monetized damages from the Tol (2009) survey as the starting point. . . . I [sic] have added an adjustment of 25 percent of the monetized damages to reflect these non-monetized impacts. While this is consistent with the estimates from other studies (see Hope 2011, Anthoff and Tol 2010, and FUND 2013), it is recognized that this is largely a judgmental adjustment.” Notice that he uses FUND, Tol’s IAM, as a benchmark. My function looks like Tol’s, he seems to be saying, so I’m good. This is the echo chamber.

When describing the revised 2016 version, in his 2017 PNAS paper, on p. 1519 Nordhaus writes: “The damage function was revised in the 2016 version to reflect new findings. The 2013 version relied on estimates of monetized damages from Tol (2009). It turns out that that survey contained several numerical errors (JEP Editorial Note, 2015). The current version continues to rely on existing damage studies, but these were collected by Andrew Moffat and the author and independently verified.”

So, as of 2013, Nordhaus thought highly enough of Tol’s 2009 paper to make those “estimates” the starting point for his DICE damage function. Then he used Tol’s FUND as a comparison to check whether his final damage function looked right. In 2016, when the gremlins paper had been discredited, he pivoted and conducted his own exercise, rooted in Tol’s idea but more sophisticated, and with very similar results. He also continues (p. 3) to apply that mysterious extra increment, just because: “We make a judgmental adjustment of 25% to cover unquantified sectors.”

The Nordhaus-Moffat study incorporates more impact numbers, but is still largely based on IAM results, including those of previous DICE versions. Nordhaus appears to be happy with Tol’s fundamental approach, just not with the execution. But note: for any level of warming, monetary climate damages are smaller in the 2016 DICE, based upon Nordhaus’s results (−0.236% GDP lost per degree warming squared), than in the 2013 DICE, based upon Tol’s flawed 2009 results (−0.267%). Nordhaus’s statistical method is described in the SI to the PNAS paper; I expect a statistician like yourself will find it interesting reading. Also, “independently verified” by whom?

I know you understand that the “data” Tol continues to use for his damage papers, including the REEP paper you reference, are just the numbers that come out of a subset of IAMs, including his own. Nordhaus 2017 used numbers from a different subset of IAMs and some additional studies. Only a few of these numbers can be said to come from a data-generating process based in actual evidence. There is little empirical basis for it, not of the kind a data person like you would recognize.

And consider for a moment the self-referential nature of this enterprise. A bunch of people, including Tol and Nordhaus, build IAMs and produce numbers purporting to show the economic damage from a given level of warming. Those numbers become a “dataset” that Tol uses to obtain a statistical relationship that forms the basis of the 2013 DICE damage function. Nordhaus, in turn, uses similar numbers from IAMs, again including both his and Tol’s, to obtain a statistical relationship that forms the basis of the 2016 DICE damage function.

I’m not a fan of climate-econ IAMs, and I’m not alone. Robert Pindyck has been hating on them, loudly, for several years. Pindyck levels a series of specific criticisms at DICE and the other IAMs, one of which is precisely that the damage functions are not empirically based. “[W]hen it comes to the damage function,” Pindyck writes, “we know virtually nothing—there is no theory and are no data that we can draw from.” That’s changing, as more people try to quantify empirically the effect of warming on economic performance, including into the distant future.

My main complaint is this: in the base DICE configuration, off the shelf, the “optimal” level of warming is 4.08 degrees C. This, says Nordhaus, is the sweet spot, as good as it gets. His damage function is just one element driving that result. But compare his number to the aspirational Paris goal of 1.5 degrees warming, or the hard Paris goal of 2 degrees. The recommendations of DICE on one hand, and almost all the world’s countries and elite climate scientists on the other, cannot both be right.
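To see what the quoted coefficients imply at these levels of warming, here is a short calculation. It assumes a pure quadratic damage function, which is how the “per degree warming squared” wording reads, and it reports losses as positive percentages of GDP. It is an arithmetic sketch only, not a reimplementation of DICE.

```python
# Quadratic damage function: percent of GDP lost = coef * (degrees C of warming) ** 2.
# The coefficients are the per-degree-squared figures quoted earlier in the email.
COEF_2013 = 0.267  # percent of GDP per degree C squared (2013 DICE, based on Tol 2009)
COEF_2016 = 0.236  # percent of GDP per degree C squared (2016 DICE)

def gdp_loss_percent(coef: float, warming_c: float) -> float:
    """Percent of GDP lost at a given warming level, assuming a pure quadratic."""
    return coef * warming_c ** 2

# Paris aspirational goal, Paris hard goal, and the DICE "optimal" warming.
for warming in (1.5, 2.0, 4.08):
    loss_2013 = gdp_loss_percent(COEF_2013, warming)
    loss_2016 = gdp_loss_percent(COEF_2016, warming)
    print(f"{warming:.2f} C: 2013 DICE {loss_2013:.2f}%, 2016 DICE {loss_2016:.2f}%")
```

Because both versions are the same quadratic with different coefficients, the 2016 loss is about 88% of the 2013 loss (0.236/0.267) at every level of warming.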

Tol and Sokal teach different lessons. Unlike Sokal, Tol is writing for his own crowd as an esteemed insider. If the method he introduced in 2009 was ever to be truly discredited, the damage function in DICE, as currently configured, would crumble. Also unlike Sokal, the Tol and Nordhaus models really matter. Theirs and one other IAM, Chris Hope’s PAGE, together form the backbone of official estimates of the social cost of carbon, a major focus of discourse around U.S. climate policy. Unlike Sokal’s gambit, this is no academic game.

tl;dr: The real problem is not the gremlins, so much as the people who tolerate them. Gremlins will always be with us: there will always be lazy scholars who are better writers than researchers, who can use the tricks of the trade to three-card-monte their mistakes. The big problem comes when leaders in the field, Nobel prize winners even, people who should know better, decide that they’d rather play nice than go for the truth.

It’s really too bad. Environmental economics is important, more important than some people’s careers or their desire to have an h-index of 3000 or meet the king of Sweden or whatever.

What’s going on here? Tol’s part of the club. People in the club think that other people in the club are absolutely brilliant. Tol’s work must be wonderful, right? He gets so many citations. This is an interesting example because it goes beyond the political left and right: it’s more of a question of the in’s and the out’s. These guys are on the inside, trapped in their own bubble.

Again, it’s not about Tol or Nordhaus in particular: they’re just examples of the larger problem of the circularity of scientific citation and prestige, at least in this subfield.

Let me scream for a moment: THIS IS A SCANDAL!

I’m no expert

A journalist contacted me and wanted me to answer some questions. I said, sure, send them over by email, and here’s what came:

** The European Union has announced that the Special Financial Mechanism (SPV) will be implemented soon. What is your assessment of this mechanism? And how much do you think PSV could help Iran countering US sanctions?

** Recently, France and Germany volunteered to host this mechanism. Do you think having Europe’s biggest economies as host will guarantee SPV’s implementation?

** Do you think SPV could solve Iran’s oil trading problems?

** The United States exempts eight countries from Iran oil sanctions and recently extended the Iraqi exemption. Do you consider this a kind of retreat? And do you think US will be forced to extend exemption for other countries too?

** In general, what is do think of the Nuclear Deal which has been signed between Iran and 5+1 countries in 2015? Despite numerous inspections by International Atomic Energy Agency that confirmed every Iran’s compliance with the provisions of the deal, why does Tramp call it the worst deal that ever made? Is it just because the Obama administration made this deal or something else is in his mind?

** White House officials have accused Iran of violating human rights and claim that sanctions do not include humanitarian activities but as the international community acknowledges, sanctions directly target the lives of the Iranian people. What is your assessment of this dual behavior?

** How do you think the political future of someone such as Donald Tramp, who did not adhere to any of his international obligations, would be like?

** White House officials have accused Iran and some other countries of supporting terrorism, but they themselves are the ones that dines with terrorists, for example John Bolton, have participated in meetings of some anti-Iran terrorist groups that have been responsible for the death of thousands of people, and sometimes finance them .Do you see this as a dual behavior?

** How do you assess Iran’s role in combating terrorism in Middle East, especially in the fight against ISIS? How does Tramp’s recent decision to exit from Syria affect the regional role of Iran, Russia and Turkey? Some say that US decision to withdraw troops from Syria was a christmas gift from Trump to these three, do you agree?

**While SWIFT decided to cut off some Iranian banks from its services, Do you think Iran, Russia and Turkey are able to launch a SWIFT-like format without using US dollars?

**Will the extermination of Nuclear Deal lead to a new crisis in Middle East? In your opinion What is going to be the worst or most alarming crisis in the world and Middle East in the New Year?

**How much energy factor affect US interventions in the Middle East?

**How serious are the differences between US and Europe in supporting the Nuclear Deal? What do you think will happen to European-American relations if these differences are not resolved?

**What effect will the US withdraw from Afghanistan might has on the Subcontinent and Iran-US-Afghan triangle equations? What is the impact of this issue on counter-terrorism approaches in Afghanistan and Pakistan?

**What is your overall assessment of Iran’s 2019 budget and how could Iran decrease budget dependency on oil and oil prices?

**What is the prospect of renting Chabahar port to India as a port exempt from sanctions? What is the inpact of this agreement on the washington and New Delhi relations? India seeks to extend sanctions relief, will it obtain US permits?

**How do you assess the dismissals and changes in the Tramp government? Some argue that during his time, Trump has been trying to put away people that have a far more positive attitude towards Iran, and select others like Bolton, who is famous for his anti-Iran policies. How do you analyze these changes?

**What is your opinion on the appointance of Heather Nauert as the United States’ ambassador to the United Nations and the adoption of anti-Iranian resolutions?

**Following the sanctions and Trump’s immigration policies, many Iranians, including Iranian students who are planning to study or already studying in the United States, and especially some patients faces serious problems. As a university professor What is your opinion about these problems?

**Along with changes in the White House, Riyadh saw some changes too. How do you assess the departure of Saudi Foreign Minister? Do both Riyadh and Washington benefit from this decision? Does the departure of adel al jubeir brings hope for a better relationship between Tehran and Riyadh?

**Given the new combination of the congress and the fact that from January these people are supposed to enter the congress and begin their work, and ongoing US government shutdown, do you think that tensions between congress and White House will continue? Will the issue of Iran be one of the main challenges between trump and congress after Democrats take charge?

I replied that I have no idea, as I know nothing about the above topics. I’m no Freud expert.

But then this got me thinking . . . from the journalist’s perspective, I guess it wasn’t so important if I was an expert, as long as I had credentials and was willing to answer the questions. But how many times are people asked these sorts of questions that they’re not qualified to answer?

I remember, years ago, when I was a student and young professor, occasionally seeing statisticians quoted in the newspaper on topics they knew nothing about. They were just bullshitting. Then after the 2000 election, the Gore campaign used a statistical expert who knew nothing about the analysis of political data. That probably wasn’t so consequential—I’m guessing the Supreme Court was going to vote on party lines in any case—but, still, it bugged me that someone was willing to act like an expert on a topic he wasn’t fully conversant with, on a case that was so important. So I’ve always tried to be careful to convey the limitations of my expertise. Lots of people don’t, though. And, as the above exchange illustrates, the demand for expertise can exceed the available supply.

When did “by” become “after”?

This post is by Phil Price, not Andrew.

I just did a Google News search for “injured after”, and these are some of the headlines that came up:

16-year-old bicyclist seriously injured after being hit by car in Norfolk
At least 1 injured after high-speed crash in Bridgeport
Teen injured after falling off rooftop
Driver injured after crash down embankment near Sherwood
Two injured after shooting on Indy’s east side

There are many more like these. They all irritate me. What, the 16-year-old cyclist was uninjured in the crash, but after the crash he got hurt somehow? The high-speed crash in Bridgeport didn’t injure someone, but someone got injured afterwards? The only one of these that I believe could be factually correct is “Teen injured after falling off rooftop”, since, yeah, ha ha, it wasn’t falling off the rooftop that hurt him, it was hitting the ground a couple of seconds later.

These should be “16-year-old bicyclist seriously injured by being hit by car”, “At least one injured in high-speed crash”, “Teen injured by fall off rooftop”, “Driver injured in crash down embankment”, and “Two injured in shooting”.

I first noticed this factually incorrect use of ‘after’ a couple of years ago but it was pretty uncommon. Now it seems to have taken over, or at least it seems to have caught up with “by” and “in”, as in ‘injured by crash’ and ‘injured in crash’ for example. And don’t bother telling me Shakespeare used ‘after’ this way, or Austen or Chaucer or Milton, it’s still wrong.

My friends, including my writer friends and editor friends, tell me to get over it, what’s the big deal, you know what they mean. Well, first of all I don’t always know what they mean, there have been times I’ve been genuinely unclear on what is being described. But also, just because I understand it doesn’t mean it’s right. If I say “this is just not the write way to phrase it”, hey, you know what I mean, but that sentence is still wrong.

I may be the only one who cares, but by god I am not giving up this battle. This new usage stinks.

Now all you kids get off my lawn.

This post is by Phil.

The 5,000 Retractions of Dr. E

Rigor, of course, but put a lid on the aggression & call off the social media hate mobs.

Software for multilevel conjoint analysis in marketing

Someone writes:

The CBC-HB and HB-Reg programs produced by Sawtooth Software are quite popular among marketing researchers and, essentially, introduced hierarchical Bayes to the marketing research community. They have been around for nearly 20 years. More recent versions, which I don’t have, offer jackknifing, and there have been other enhancements. I’m not sure how well-known Sawtooth is outside of marketing research, though, and you might not have heard of them.

About 10 years ago they allowed the user to specify covariates at the upper level of the model to reduce shrinkage to the mean. So the user would not specify different priors for men and women or for respondents in their 20s, 30s, 40s, for example, but the covariates, in theory at least, help us better account for respondent heterogeneity.

More importantly, equations for each individual case (e.g., respondent in a survey) are saved in CSV. They are the means of the draws for each respondent at the lower level of the model. These “equations” can be merged with the original data file for segmentation with cluster analysis or post hoc cross tabs by different kinds of respondents, e.g., income group or purchase frequency.

Sawtooth specializes in what they have dubbed choice-based conjoint, which is an extension of McFadden’s discrete choice model and earlier ratings-based conjoint. For your reference, I’ve attached a brief article on “conjoint,” as it’s usually called. The forecasts are really “what if?” simulations in which various product features (e.g., price) are varied to estimate what the impact on share of preference would be.

I was curious if you have heard of anyone using Stan in these ways.

My reply: I’ve never heard of this particular software called Sawtooth. In Stan, there’s this Prophet package for forecasting, developed by Sean Taylor when he was at Facebook but freely available. See also this description I found on the web.

More generally, if people can use Stan to fit more flexible models, maybe starting with something like Prophet that has existing models and then using this as a springboard to building their own custom models, that would be great. We’re also fine with Stan being used within commercial software. Stan and CmdStan have the business-friendly BSD license.
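
To make the correspondent’s “what if?” simulations a bit more concrete, here’s a minimal sketch in Python of how a share-of-preference calculation might go, assuming a plain multinomial logit choice rule and respondent-level coefficients taken from a hierarchical fit (for example, posterior means from a Stan model). The attribute names, the coefficient values, and the functions `choice_probabilities`, `share_of_preference`, and `scenario` are all made up for illustration; this is not Sawtooth’s implementation, and I don’t know the details of what their software does.

```python
import math


def choice_probabilities(utilities):
    """Multinomial-logit choice probabilities for one respondent."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]


def share_of_preference(part_worths, products):
    """Average the logit choice probabilities over respondents.

    part_worths: one dict per respondent, attribute name -> coefficient
    products:    one dict per product, attribute name -> numeric level
    """
    shares = [0.0] * len(products)
    for beta in part_worths:
        utilities = [
            sum(beta[attr] * level for attr, level in product.items())
            for product in products
        ]
        for j, p in enumerate(choice_probabilities(utilities)):
            shares[j] += p
    return [s / len(part_worths) for s in shares]


# Hypothetical respondent-level coefficients (illustrative values only).
part_worths = [
    {"brand_a": 1.0, "price": -0.5},
    {"brand_a": 0.2, "price": -1.5},
    {"brand_a": -0.3, "price": -0.8},
]


def scenario(price_a):
    """Two-product market: brand A at price_a versus a competitor at 2.0."""
    return [
        {"brand_a": 1.0, "price": price_a},
        {"brand_a": 0.0, "price": 2.0},
    ]


baseline = share_of_preference(part_worths, scenario(2.0))
raised = share_of_preference(part_worths, scenario(3.0))
print("Share for brand A at price 2.0:", round(baseline[0], 3))
print("Share for brand A at price 3.0:", round(raised[0], 3))
```

Averaging point estimates like this throws away posterior uncertainty. With the full set of draws from Stan, one could instead compute the shares separately for each draw and summarize the resulting distribution.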