The other day someone pointed me to this article by James Kaufman and Vlad Glǎveanu in a psychology journal which begins:
How does the current replication crisis, along with other recent psychological trends, affect scientific creativity? To answer this question, we consider current debates regarding replication through the lenses of creativity research and theory. Both scientific work and creativity require striking a balance between ideation and implementation and between freedom and constraints. However, current debates about replication and some of the emerging guidelines stemming from them threaten this balance and run the risk of stifling innovation.
This claim is situated in the context of a fight in psychology between the traditionalists (who want published work to stand untouched and respected for as long as possible) and replicators (who typically don’t trust a claim until it is reproduced by an outside lab).
Rather than get into this debate right here, I’d like to step back and consider the proposal of Kaufman and Glǎveanu on its own merits.
I’m 100% with them on reducing barriers to creativity, and I think that journals in psychology and elsewhere should start by not requiring “p less than 0.05” to publish things.
Nothing is stopping researchers such as the authors of the above paper from publishing their work without replication. So I’m not quite sure what they’re complaining about. They don’t like that various third parties are demanding they replicate their work, but why can’t they ignore these demands?
Indeed, as I wrote above, I think the barriers to publication should be lowered, not raised. And if an Association for Psychological Science journal doesn’t want to publish your article (perhaps because you don’t have personal connections with the editors; see P.S. below), then you can publish it in some other journal.
If you flip a coin 6 times and get four heads, and you’d like to count that as evidence for precognition or telekinesis, and publish that somewhere, then go for it.
As long as you clearly and openly present your data, evidence, and argument, it seems fine to me to publish whatever you’ve got. And if others care enough, they can do their own replications. Not your job, no problem.
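To put a rough number on the coin-flip example above, here is a minimal sketch, assuming a fair coin and independent flips (the helper name `prob_at_least` is mine, just for illustration). It computes how often four or more heads in six flips would turn up by chance alone.

```python
from math import comb


def prob_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p).

    Assumes independent flips with heads probability p (fair coin by default).
    """
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


# Probability of exactly four heads in six fair flips: C(6, 4) / 2^6 = 15/64.
p_exact = comb(6, 4) * 0.5**6

# Probability of four or more heads in six fair flips: (15 + 6 + 1) / 64 = 22/64.
p_tail = prob_at_least(4, 6)

print(f"P(exactly 4 heads in 6 flips) = {p_exact}")
print(f"P(4 or more heads in 6 flips) = {p_tail}")
```

Under these assumptions the tail probability is 22/64, roughly one in three, so a result like this is entirely unremarkable as evidence. That is the point: it’s fine to publish it, as long as readers can see the data and judge for themselves.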
What strikes me is that the authors of the above article, and other people who present similar anti-replication arguments, are not merely saying they want the freedom to be creative. They, and their colleagues and students, already have that freedom. And they already have the freedom to publish un-preregistered, un-replicated work in top journals; they do it all the time.
So what’s the problem?
It seems that what these people really are pushing for is the suppression of criticism. It’s not that they want to publish in Psychological Science (which, in its online version, could theoretically publish unlimited numbers of papers); it’s that they don’t want the rest of us publishing there.
It’s all about status (and money and jobs and fame). Publishing in Psychological Science and PNAS has value because these journals reject a lot of papers. They’re yammering on about creativity, but nothing’s getting in the way of them being creative and conducting and publishing their unreplicated studies. No, what they want is to be able to: (a) perform these unreplicated studies, (b) publish them as is, (c) get tenure, media exposure, etc., and (d) deny legitimacy to criticism from outside. The key steps are (c) and (d), and for these they need to play gatekeeper, to maintain scarcity by preserving their private journals such as Psychological Science and PNAS for themselves and their friends, and to shout down dissenting voices from inside and outside their profession.
Innovation is not being “stifled.” What’s being stifled is their ability to have their shaky work celebrated without question within academia and the news media, their ability to dole out awards and jobs to their friends, etc.
Freedom of speech means freedom of speech. It does not mean freedom from criticism.
P.S. I have no idea how much reviewing happened on the above-linked paper before it was published. Here’s what it says at the end of the article:
And here’s something that the first author of the article posted on the internet recently:
This last bit is interesting as it suggests that Kaufman does not understand the Javert paradox. He’s criticizing people who “devote their time” to criticism, without recognizing that, in the real world, if you care about something and want it to be understood, you have to “devote time” to it. In the particular case under discussion, people criticized Sternberg’s policies quietly, and Sternberg responded by brushing the criticism aside. Then the critics followed up with more criticism. Sure, they could’ve just given up, but they didn’t, because they thought the topic was important.
Flip it around. Why did Kaufman and Glǎveanu write the above-linked article? It’s because they think psychology is important—important enough that they want to stop the implementation of policies that they think will slow down research in the field. Fair enough. One might disagree with them, but we can all respect the larger goal, and we can all respect that these authors think the larger goal is important enough, that they’ll devote time to it. Similarly, people who have criticized Sternberg’s policy of filling up journals with papers by himself and his friends, and suppressing dissent, have done so because they too feel that psychology is important—important enough that they want to stop the implementation of policies that they think will slow down research in the field. It’s the same damn thing. Susan Matthews put it well: We need to normalize the pursuit of accuracy as a good-intentioned piece of the scientific puzzle.
P.P.S. I think I see another problem. In the reference list to the above-linked paper, I see this book:
Kaufman A. B., Kaufman J. C. (Eds.). (2017). Pseudoscience: The conspiracy against science. Cambridge, MA: MIT Press.
P.P.P.S. Just to clarify my recommendation to “publish everything”: I do think reviewing is valuable. I just think it should be done after publication. Put everything on Arxiv-like servers, then “journals” can do the review process, where the positive outcome of a review is “endorsement,” not “publication.” Post-publication reviewers can even ask for changes as a condition of endorsement, in the same way that journals currently ask for changes as a condition of publication.
The advantages of publishing first, reviewing later, are: (a) papers aren’t sitting in limbo for years during the review process, and (b) post-publication review can concentrate on the most important papers, rather than, as now, so much of the effort going into reading and reviewing papers that just about no one will ever read.
For more, see:
An efficiency argument for post-publication review
and
Quote from the blogpost: “It’s all about status (and money and jobs and fame).”
I have heard that a professor’s possible promotion, and extra money, may depend partly on publishing in “official” and “best” journals.
I have heard that universities may get a part of the money coming in from grants (for “reasons”).
If these things are (sometimes) correct, i find it odd, and incomprehensible, that these things seem to not be talked about much in all these “let’s improve things” and “let’s change the incentives” (whatever that means) talks.
Now you can make your university rich, receive a lot of media attention, and possibly get a promotion, by wasting tons of other people’s money via large-scale “collaborative” efforts, or being a “director” of some sort of “center”. But is that truly different, or an improvement, compared to making your university rich, receiving a lot of media attention, and possibly getting a promotion, by wasting tons of other people’s money via “counter-intuitive” and “sexy” findings?
Why am i not hearing about proposals on how to improve matters that concern power, money, and influence? I think i read something on this blog about handing out relatively smaller grants to relatively more people. I think these types of possible improvements should be talked about much more.
Also see “Excellence by nonsense: The competition for publications in modern science” by Binswanger (2014):
https://link.springer.com/chapter/10.1007/978-3-319-00026-8_3
“(…) and “let’s change the incentives” (whatever that means) talks.”
I think that when someone mentions “the incentives” in a discussion about improving science, it could sometimes be useful to refer to the (what i think is an excellent, funny, and possibly useful) blogpost by Tal Yarkoni called “No, it’s not The Incentives – it’s you”:
https://www.talyarkoni.org/blog/2018/10/02/no-its-not-the-incentives-its-you/
I don’t find that blog post very interesting at all. Here’s the thing. Suppose you’re a person who doesn’t like fucking around with “sexy” findings to get funding and wants to do real science. Suppose also that you somehow get yourself a job given “the incentives”; then either you change to match the incentives, or within a couple years you lose your job and move on. Having seen this, people like you who are earlier in their career decide to move on before even getting a job. Soon, the entire queue of people waiting for research jobs is full of people who *like* fucking around and getting huge respect for it.
So, yeah, “it’s you” and “all the other people who are trying to get a job in a system where you have to do fake research on sexy findings and come up with false reasons why your research is going to cure cancer or revolutionize electrical power generation within a decade or solve the problem of homelessness with a wristwatch tracking device or whatever garbage”
So, yeah, it’s the incentives, the incentives create an industry of fakers because the incentives reward faking.
Basically your whole first post is “the incentives” such as that universities take 50-60% additional on top of whatever the granting agency gives the PI, and so the universities hire people who easily get lots of grants, and the grants are given to people who publish lots of sexy research in PNAS and Nature and Science and so forth, and the publications are given to people who are “in the in-crowd” who do the kinds of research that is sexy and amazing… and often bullshit…
So, talking about “it’s you” is like walking into Wall Street and saying “y’all are a bunch of misanthropes whose whole job is to game the system” and the room is full of Martin Shkreli clones who are nodding along and smiling and thinking “dope!”
Quote from above: “So, yeah, it’s the incentives, the incentives create an industry of fakers because the incentives reward faking.”
I think the system possibly rewards money, power, influence, etc. You can (could?) get those by faking, or doing sloppy, science, but that’s possibly not the crucial thing.
I am now worried that entire new things are being done, and proposed, that can get you money, power, and influence and may be just as bad for science as “faking” and “sexy papers”. Part of my point is that i think it could be important and useful to really think about what could work to improve matters, and why and how. My comment above alludes to that.
Quote from above: “so, talking about “it’s you” is like walking into wall street and saying “y’all are a bunch of misanthropes whose whole job is to game the system” and the room is full of Martin Shkreli clones who are nodding along and smiling and thinking “dope!””
Exactly! I reason that the majority of people currently working at universities are possibly not very good scientists, nor ones to aspire to. At least not for me.
I don’t want to ask for tons of funding knowing part of it goes directly to the university for “reasons”.
I don’t want to make publishers rich by playing the “let’s see if i can get my paper in this top-journal” game.
I don’t want to make my university rich by “educating” young people for a job that probably doesn’t even exist for them because there are too few jobs relative to the number of people that are being “educated” for them.
For my 1st paper i did not play the “let’s send it to a top journal” game because i reasoned it doesn’t matter where it’s published as long as people can find it. For my 1st paper i did not listen to the reviewer who told me to “leave the additional analyses out because they didn’t add anything”, nor did i listen to my professor who said “the reviewer is always right”.
(Un)luckily i never got a job after graduation so i didn’t participate in the system anymore. Due to the spare time i had from being unemployed, and other circumstances, i learned in the following years that i don’t ever want to work at a university anymore. And i don’t ever want to publish in an “official” journal anymore.
Regardless of all of the above:
1) Do you agree with the idea that the “real” problematic issues in current day science might be related to money, power, influence, etc.?
2) And if yes, do you have any ideas about how to “really” tackle these issues? (i gave one possibility concerning handing out smaller grants to more people)
I think somehow you miss that “money power and influence” *are* the incentives.
That is, whatever gives you money, power, and influence is the thing that many people will want to do, and maybe more important the things that will keep you employed.
But, I think if you get money, power and influence by actually curing diseases, improving public health, finding policies that decrease wealth inequality and increase total GDP simultaneously, discovering new ways to eliminate pollution from the oceans, reducing the environmental impact of humans on wild areas while simultaneously increasing the quality of life for individuals in rural areas… etc then *that’s a fine thing*. But if you get money, power and influence by publishing a large number of papers in “top” journals *independent of the importance and correctness and good things that come from that research* then you will find lots of “hot” research being done, because solving the world’s problems isn’t paying the bills.
How to *really* tackle these issues… Yes, I guess we’ve had this kind of discussion a lot here on the blog; you might search this blog for some of my previous comments. Right now I have to do some other things, but maybe I can come back to this topic Friday or so.
Here’s a thread from 3 years back: https://statmodeling.stat.columbia.edu/2016/06/01/the-natural-selection-of-bad-science/#comment-276806
1) “I think somehow you miss that “money power and influence” *are* the incentives.”
Yes, i get that (i think). That is exactly what i am trying to make clear. I am trying to say that i think recent discussions about “the incentives”, and how to possibly improve matters, are not mentioning money, power, and influence (i.c. the possible real incentives). Instead they seem to only talk about “sexy papers” and “publish or perish”, and stop at that “level”.
They seem to leave out what i think is the most important thing, and “level”: the “sexy papers” are possibly only useful to in turn get you money, power, and influence.
When you focus on the “sexy papers” and “publish or perish” in discussions and proposed improvements, you are possibly not tackling the real issue (i.c. money, power, and influence). The narrative surrounding “the incentives” in my experience has always been mostly about how researchers need to get published in the “best” journals, blablabla, and how a long list of publications gets you tenure, blablabla.
In my experience, those are “the incentives” (i.c. long list of publications, publishing in the “best” journals) that are talked about, and brought forward to explain why researchers use questionable research practices and all that stuff. However, i reason they may only be secondary incentives. The real incentives like money, power, and influence are never really brought into all of this, but should possibly be the things you should be talking about, and want to change. That’s the point i am trying to make!
If you are only discussing things on the “level” of “sexy papers” and “publish or perish” (as i think has been happening) you might just be substituting “sexy papers” with something that is “not a sexy paper”, but that doesn’t mean it’s good for (improving) science. I reason you need to get rid of (the possible negative effects of) money, power, and influence in science as much as possible.
2) “Here’s a thread from 3 years back: https://statmodeling.stat.columbia.edu/2016/06/01/the-natural-selection-of-bad-science/#comment-276806”
Ah, thanks. I am/was the “Anonymous baker” coincidentally in that discussion :)
I like the idea of doing something with grants (money) as i reason that’s one of the things (“levels”) you should be working on to fix the possible problems in science like i tried to make clear above. I am not sure i agree with randomisation, although i can see it might be better than what is currently happening.
Anyway, i have been thinking about this for only a very short while, but all i can think about that makes sense is the following:
# Hand out smaller grants to everyone.
# Stop letting universities take a share of the grant money (for “reasons”)
# Perhaps the individual researchers/labs can join forces and collaborate if they wish but that is their own choice, which will hopefully be made on the basis of scientific reasons.
# To make sure that scientific reasons are used when thinking about possibly joining forces, i reason it’s important to subsequently make sure joining forces, or not joining forces, will not influence future grant receiving.
# I also think it’s important to make sure joining forces, or not joining forces, will not influence publication.
It’s you because of the incentives. The incentives helped to select you. The world is full of people, you will always find someone that fits the incentives in place. So, if you think the problem with science is the current version of who “scientists” are, in order to attract a different kind of people to science you will need to change the incentives.
Yes, that puts it very succinctly. Thanks
“It’s you because of the incentives. The incentives helped to select you. The world is full of people, you will always find someone that fits the incentives in place. So, if you think the problem with science is the current version of who “scientists” are, in order to attract a different kind of people to science you will need to change the incentives.”
The dictionary defines “incentive” as “something that encourages a person to do something”.
Let’s say scientist A is all about doing actual, real, and good science (that’s A’s “incentive” or “encouragement to do something”), and scientist B is all about getting money, power, and influence (that’s B’s “incentive” or “encouragement to do something”).
Let’s say we want scientist A, and not scientist B, working in science.
Exactly which “incentives” need to be changed in order to achieve that (and how, and why)?
> Let’s say scientist A is all about doing actual, real, and good science (that’s A’s “incentive” or “encouragement to do something”),
No, that’s scientist A’s *reason* for doing something… An incentive would be an external factor that rewards doing that thing, which makes it more likely for people to continue doing it. So for example if Big Medical Foundation reads the work of scientist A and gives them a grant to continue doing it, this is an incentive; if Big Medical Foundation reads their grant application and tells them “this is too risky, will take too long, and isn’t of interest to us, no money for you” that is a *disincentive* to continue that work.
How can we adjust incentives in science to align with what we want to accomplish? This is a major question. The general field is called “Mechanism Design” in economics https://en.wikipedia.org/wiki/Mechanism_design
It could be worthwhile to consider the literature on mechanism design and actually think about what a good mechanism would be. One first step is to ask what things we actually consider good. What would ideal science look like?
“The dictionary defines “incentive” as “something that encourages a person to do something”. ”
Another definition of “incentive” i came across reads: “a thing that motivates or encourages someone to do something” (synonyms that are mentioned include “motive” and “reason”)
I can still remember me writing about how i “wanted to (positively) contribute to science” as my ultimate goal in my obligatory (and in my opinion pretty useless) self-reflection part of the work i had to hand in concerning my graduation.
I think “wanting to (positively) contribute to science” has been my incentive, and motive, and reason for doing things all this time. I think this has:
# influenced me to conduct my research in the best way i knew how to,
# influenced me to not play the “let’s try and get my paper published in the “best” journal”-game,
# influenced me to not listen to reviewers when they wanted me to do things that i thought were sub-optimal from a scientific perspective,
# influenced me to participate in several “open science” and “let’s improve” things, and
# influenced me to participate for years on this forum, and others.
If this can be seen as “good” for (and in line with) science, why the F#CK should “the incentives” be changed?
There is nothing wrong with my, and a lot of others’, “incentives” i reason…
The opportunities for getting money, power and influence within the scientific system without paying attention to actual, real and good science need to be drastically reduced. I tend to read several of the proposals advanced in this blog (open access to data and code, pre-registration, encouragement of criticism in publications, sharper and faster retraction mechanisms, and so on and so forth) as leaning in this direction.
Quote from above: “The opportunities for getting money, power and influence within the scientific system without paying attention to actual, real and good science need to be drastically reduced. I tend to read several of the proposals advanced in this blog (open access to data and code, pre-registration, encouragement of criticism in publications, sharper and faster retraction mechanisms, and so on and so forth) as leaning in this direction.”
I agree that “the opportunities for getting money, power and influence within the scientific system without paying attention to actual, real and good science need to be drastically reduced”, but i reason this is at best only half the work.
If money, power, and influence still are in effect, problematic issues can still arise. If i want to get tenure i could come up with lots of ideas to ask for lots of money, so i am making my university famous and richer, and increasing my chances of getting tenure, or a salary raise, etc.
If the ideas i come up with, that cost a lot of money or give power to a small group of people, use open data, code, pre-registration, etc. this is not necessarily a reason for supporting these things, nor is it necessarily really an “improvement” in my reasoning (as i tried to make clear somewhere above in the comments).
I think the point is that it’s much easier to keep talking about the Emperor’s New Clothes when everyone only gets to see carefully Photoshopped professional press-release versions of the clothes. As soon as everyone gets to look directly at the Emperor they can see that he ain’t wearing nothing…
So open data, open code = direct access to the clothes reduces the chances that people can gain influence based on BS.
Preregistration is in my opinion just a hack to make p values not so terribly stupid. Rather than pre-registration I’d rather have open data, open code, and post-data argumentation: why should I believe that your Bayesian model makes sense and does it really fit to the posterior distribution you published (I can verify this with open data and code) and are there alternative models that might also fit well (I can publish my own if the data is open).
Science as received wisdom for the masses from the high priests and priestesses of the guild is a major part of what’s non-scientific… that priests and priestesses have been venerated throughout history is part of the incentive structure: power and influence for those who hide behind secret data and anonymous peer review.
“MOTIVE implies an emotion or desire operating on the will and causing it to act. […] INCENTIVE applies to an external influence (such as an expected reward) inciting to action.”
Quote from above: “MOTIVE implies an emotion or desire operating on the will and causing it to act. […] INCENTIVE applies to an external influence (such as an expected reward) inciting to action.”
I reason an external influence is but one type of “incentive”. This is what i now think has become much clearer as a result of this discussion. If this makes sense, i reason talking about “the incentives” in recent discussions is sub-optimal for 3 reasons:
1) they seem to me to almost always assume there are only external “incentives”
2) they seem to me to almost always assume all these external “incentives” are all there is (i.c. if an “incentive” is an external influence (such as an expected reward) inciting to action, does this not also involve the “motive” to adhere to/follow these external incentives?)
3) they seem to me to almost never talk about the possible actual things that are possibly crucial concerning these “incentives”.
Perhaps i can try and make my point clearer via a cat, which i think fits nicely with the appearance of the occasional cat picture on this blog.
You can get a cat to look at, and follow, those red laser pointers that you can point to the floor. Let’s say the cat is the scientist, the red laser pointer the “incentive”, and shining the red laser dot in a certain “good science is done in here”-hallway is “nudging” the scientist to do good science.
I think the cat should not be following the red laser dot, nor should anyone want to shine the red laser dot in a certain hallway because they assume, or think, or believe, that is the “good science is done in here”-hallway. Who is the person behind the red laser pointer to decide which is the best “good science is done in here”-hallway? Who says the person behind the red laser pointer is not shining the red dot into the wrong hallway?
I reason the cat should go in the “good science is done in here”-hallway by their own choice, not by following a red laser dot on the floor. I reason the cat should not be following the red dot light, but follow their inner light.
We shouldn’t even talk about “incentives”, red laser dots, or lights. We should be talking about what, why and how to perform good science, and scientists. Talking about “the incentives” without really thinking about them, and without explicitly mentioning what they are, will probably (and has already in my opinion) at best only lead to sub-optimal discussions and/or proposed “solutions”.
1) Also see here https://en.wikipedia.org/wiki/Incentive
“An incentive is a contingent motivator.[1] Traditional incentives are extrinsic motivators which reward actions to yield a desired outcome. The effectiveness of traditional incentives has changed as the needs of Western society have evolved. While the traditional incentive model is effective when there is a defined procedure and goal for a task, Western society started to require a higher volume of critical thinkers, so the traditional model became less effective.[1] Institutions are now following a trend in implementing strategies that rely on intrinsic motivations rather than the extrinsic motivations that the traditional incentives foster.”
2) And also see this quote from the Binswanger (2014) paper “Excellence by nonsense: The competition for publications in modern science”
“Carrots and sticks replace the taste for science (Merton 1973) which is indispensable for scientific progress. A scientist who does not truly love his work will never be a great scientist. Yet exactly those scientists who are intrinsically motivated are the ones whose motivation is usually crowded out the most. They are often rather unconventional people who do not perform well in standardized competitions, and they do not feel like constantly being forced to work just to attain high scores. Therefore, a lot of potentially highly valuable research is crowded out along with intrinsic motivation as well.”
Quote from above: “MOTIVE implies an emotion or desire operating on the will and causing it to act. […] INCENTIVE applies to an external influence (such as an expected reward) inciting to action.”
As an aside: if i follow your Merriam-Webster name+link i read the following 2 definitions:
1) Motive: “something (such as a need or desire) that causes a person to act”
I interpret that this can be a lot of things, including things like emotions, or desires, or even external influences.
2) Incentive: “something that incites or has a tendency to incite to determination or action”
I (still) interpret that this can be a lot of things, including things like emotions, or desires, or even external influences.
Quote from above: “Talking about “the incentives” without really thinking about them, and without explicitly mentioning what they are, will probably (and has already in my opinion) at best only lead to sub-optimal discussions and/or proposed “solutions”.”
Now i am all riled up!
Another 2 things that annoy me concerning all this “incentives”-talk of the past years:
1) I sometimes hear folks talk about how can we “incentivize XYZ”. “XYZ” is then something they probably think, assume, feel, is “good” for science, but this is never really further (critically) discussed. It is just assumed, and agreed upon, that “XYZ” is “good” for science.
It seems to me that whenever the word “incentive” is somehow brought into it, you can then propose just about anything. It sometimes seems to me that as long as you are talking about “the incentives” that’s all that matters. Whether or not something actually might be good for (improving) science sometimes seems to be of less importance than using words like “incentives”…
2) All this talk about “the incentives” also can go hand in hand with one of my other annoyances: a possible “special” status of, and over-reliance on, “meta-scientific” research.
I sometimes get the idea that meta-scientific research is somehow seen as super important, totally objective, and/or the highest form of research or something like that. I reason, and assume, meta-scientific research can be troubled by the same things as “regular” research.
I reason meta-scientific research into “the incentives” can easily lead to steering science down the wrong path. If you place a lot of value on meta-scientific results, and not really think about matters, you could easily come up with all kinds of “solutions” and “improvements” that show (short-term) positive effects on science, but may still not be a very good idea.
(As an aside: can anybody point me to any actual research, and data, concerning “the incentives”? Is there any data on how much grant money possibly goes to the university for “reasons”? Are there any papers with data about which “incentives” play a role in receiving tenure? Aren’t these things important to investigate before coming up with all kinds of “solutions” that influence “the incentives”?)
“(…) i reason this is at best only half the work”
Agreed. I am one of those people who consider half the work to be infinite per cent more than nothing.
“If money, power, and influence still are in effect, problematic issues can still arise”.
Money, power and influence will always be in effect, because one important feature of “actual, real and good science” is that it works, so it is only natural that it may bring you money, power and influence. Problematic issues can always arise, but the thing about open data etc. (IMHO) is that with these methods they will be more easily detected and identified. And that’s half the way to solving them.
Also the blog post focuses on cases of outright fraud, like Diederik Stapel, doing things like data fabrication, as opposed to more “passive fraud” like just failing to do science and instead going with the flow of traditional shitty research practices that “everyone else does too” like interpreting stars in your regression as Oracles of Truth. It seems to miss the point that some huge percentage of what “scientists” are doing today is *not science*. It’s not just a few outliers who succumbed…
It also seems naive, like he gives the example of the doctor prescribing lots of tests and how “you’d be livid”. Well, I AM LIVID every time I go to a doctor’s office and they want to prescribe a $500/mo steroid inhaler that’s hardly more effective than the inhalers from 20 years ago but costs 100x as much and comes with special coupons to offset the out of pocket cost so the patient will happily go along with it all while the insurance companies get milked and then pass that back to the consumer in the form of premiums that no-one can afford, and soon it’s a routine process for people to break their ankle on a skateboard and go bankrupt and become homeless because they didn’t have insurance and their financial situation was precarious….
so yeah, again, it’s the incentives, and NO doctors may not be happy about it, but they do it *every damn day of their lives* and I’ve had them tell me “I just close my eyes and don’t think about it because if I did I wouldn’t be able to practice medicine”. so yeah, people who want to help us with our medical ailments *are* working for “the system” and it’s not just the occasional guys who are in the news who go overboard and actively deal heroin in pill form to black markets or whatever.
If you want things to improve *you have to fight the system that preserves the current equilibrium*; it’s no good to put it all on bad actors or just try to walk away. (though I note that apparently plenty of people have decided to start walking away from the US to avoid “the system” of student loans: https://www.cnbc.com/2019/05/25/they-fled-the-country-to-escape-their-student-debt.html )
It would appear to me that incentives are a huge part of this larger topic.
https://statmodeling.stat.columbia.edu/2019/04/12/several-reviews-of-deborah-mayos-new-book-statistical-inference-as-severe-testing-how-to-get-beyond-the-statistics-wars/#comment-1014479
I think Art Owen mentioned something about incentives (in terms of misuse of statistical methods) in the last section here: https://statmodeling.stat.columbia.edu/wp-content/uploads/2019/03/review.pdf
Publication, replication, misuse of statistical methods – are these all topics relating to the same system? Perhaps one with some problems.
A sort of funny joke from my days as an anthropology graduate student:
Q: Why do academics argue so much?
A: Because the stakes are so low!
Thanks!
wasn’t this a (the?) major theme of Richard Russo’s Straight Man?
In fairness, the text of the linked article is more about the idea that forcing any scientific product to fit into a specific mould would be detrimental, a sentiment with which I and, I suspect, most people would agree. The problem is that the authors do not realize (or at least do not acknowledge) that they have already been operating in a system in which this is true.
In their system—the one they seem to be fighting to preserve—you take a hypothesis that is vaguely related to something someone important in the field has said, obtain a tangential prediction from that hypothesis, and publish it when a thoughtlessly applied statistical test has an asterisk next to it. If you don’t follow these guidelines, you will not garner the social support of your superiors (you are not “part of their team”) and you will have to spend your time justifying why you did something different rather than explaining why what you did is important (i.e., “why did you use a Bayes factor” instead of “what does the result mean”). You can have a career, but it will not be one filled with the social and material rewards that could be obtained if you just “played by the rules”. In their system, the rules are still there, they are just unspoken.
I do think it is worthwhile to consider that some proposed remedies to the replication crisis could stifle innovation if they force all science to stick to a short list of pre-approved methods. But it is equally important to remember that this is already how science operates—it’s just that those “pre-approved methods” don’t do much to advance knowledge.
Kaufman wrote:
“other people’s work […] will be long dead forgotten footnotes when Sternberg’s work is still read.”
Sternberg wrote:
“Many professors and students, not only when I was in graduate school, but also throughout the world have built their careers on practices once considered both creative and perfectly legitimate but that today might be viewed as dubious. What this fact highlights is that scientific creativity—indeed, any form of creativity—can be understood only in context (Csikszentmihalyi, 1988, 2013; Plucker, 2017; Simonton, 1994, 2004; Sternberg, 2018).”
Raise your hand if you think Sternberg’s works will be read 20 years from now.
Matt:
I dunno, people are still reading Nostradamus, right?
What about Chariots of the Gods? Are people still reading that? 40 years ago, the two books you were most likely to see at random were that and Alive! (you know, that book about that South American soccer team with the crashed plane). But some organization seems to have scoured the world for all extant copies of those books and pulped them. Along with the 90 million Perry Mason books.
I used to come here for enlightenment. Now I just show up to wait for the von Daniken references. It’s a good day.
“(…) Alive! (you know, that book about that South American soccer team with the crashed plane).”
I don’t like reading, but i’ve seen the movie (https://www.imdb.com/title/tt0106246/) and found it very impressive.
I’ve also seen a documentary about it, with interviews from the actual people involved. The footage of the men who went down the mountain and found help is incredible to watch for me as well. I think that documentary also showed some footage of a journalist asking them what they had eaten to stay alive all that time, or something like that, and the look on the faces of the men after hearing that question…
I find the movie/story/documentary inspirational, and tremendously impressive to this day.
“What about Chariots of the Gods?”
Well, you can actually read Chariots of the Gods. Disillusioned Psychologist said it best below:
“Because psychology’s theories are weak, its findings equivocal, and the field very susceptible to hype, it produces little that endures and the wheel is reinvented constantly.”
Remember that theory about the brain in which it was determined that we have a mammalian brain in the middle, surrounded by an ape brain, surrounded by human brain? An entire psychology literature arose. Not getting many citations these days.
Here is my analogy:
If you are writing papers in physics or chemistry, you are an ant carrying a grain of sand to the top of a hill, buttressed by the grains others have carried. If you are writing a paper in psychology, you are a spider attempting to extend a filament of web off into the ether, supported by a tenuous bond to the surrounding web and extending to who knows where. Periodically the entire web gets ripped down.
These days the new motto should be ‘you will perish if you publish’. lol. There is truth to that as some academics have pointed out that there are simply far too many researchers and papers. I have noticed that scientists themselves have sought out highly eclectic creatives as Serge Lang and his circles in the National Academy of Sciences have. But not simply Serge and his circles but at institutions like Harvard and MIT. Exceptional creativity may not even be the proper term. Fluid and crystallized intelligence may be better markers if they can be measured. I find their measurement somewhat tedious, though, and enough to lead even the most creative into being way less creative. It takes a lot of time to cultivate quality anything, notwithstanding luck, chance, and opportunity.
Hold it, you’re not criticizing Nostradamus, are you? How do you explain how he predicted Hitler and the Raptors taking the East?
Kaufman was Sternberg’s student so it’s not entirely surprising that Kaufman would exhibit a sort of loyalty to him (and further indicates how problematic Sternberg’s role as editor for the article might have been). I cannot speak for what it is like in other disciplines but in psychology individuals’ theories and reputations are kept afloat by their students, their students’ students, etc. Sternberg has had many students and many have been quite successful, so some of his work will likely be cited 20 years from now (whether it is read with any degree of care is another matter). Academic psychologists generally have short memories (just look at the average date of the entries in a References section) so once this sort of reverse nepotism erodes Sternberg’s work will likely cease to be frequently referenced; Hans Eysenck wrote over 1,000 articles, wrote tens of books, and had one of the preeminent models of personality but he died in 1997 and his work is given relatively little in-depth attention now. Because psychology’s theories are weak, its findings equivocal, and the field very susceptible to hype, it produces little that endures and the wheel is reinvented constantly (see Table 1 in Greenwald [2012] for a list of major theoretical controversies that psychological research has failed to solve). A “replication crisis” was referenced in the 1980s and here is the first line from Bakan (1966): “That which we might identify as the ‘crisis of psychology’ is closely related to what Hogben (1958) has called the ‘crisis in statistical theory’.” In that same paper Bakan cites multiple studies going back decades that document problems with significance testing, yet almost 60 years later the field is still struggling with how to deal with the problem. I am not sure if a definitive resolution will occur even now because journal editors’ avowed devotion to replication and reducing overreliance on statistical testing may be little more than lip service.
I cannot say how representative my experience is, but I was the co-author of a study submitted to a “top” journal that was close to a direct replication of a previous study we’d done, and we devoted quite a bit of space to why the study was important because of this. In the R&R comments none of the reviewers or the editor mentioned the fact that it was a replication – nor did they have a problem with the fact that dozens of significance tests were performed (without corrections for multiple comparisons), even though the journal guidelines specifically said that confidence intervals should be used, not significance tests. In the version of the paper eventually accepted and published the dozens of tests remain, replication is mentioned twice, and there are no confidence intervals.
> I cannot speak for what it is like in other disciplines but in psychology individuals’ theories and reputations are kept afloat by their students, their students’ students, etc. Sternberg has had many students and many have been quite successful, so some of his work will likely be cited 20 years from now (whether it is read with any degree of care is another matter).
I can vouch for the exact thing happening in natural language processing. I see work that was once popular and I believe is now only being cited because of dogged professors with grad students who don’t get out much. It’s much harder, but I’ve even seen this done in top departments—it’s not just out on the fringes.
This makes some kind of sense, because of two factors, tenure and focus. Once you get tenure, there’s not nearly the same incentive to keep up. Departments that would get rid of that quirky research if they could still have faculty who get grants that fund students, so the students keep coming out.
There’s also intense pressure in academia to stay focused in a single area and go deeper. Breadth is rarely rewarded directly.
If you look at the topic of half of my thesis and my first book (logic programming, and particularly typed logic programming with struct-like data structures), it’s pretty much dead (there was a great paper by Dan Jurafsky and students on time-series for natural language topics and that one peaked around the time of my thesis). My book may have helped kill the field as I consolidated a bunch of results, but at the time, at least half the people in the field thought I was on the wrong track and proving the wrong theorems (I was focused on type systems with linear inference; let’s just say that efficiency wasn’t a primary concern of the literature). But mostly it was statistics that took over the field, pushing logic and semantics to the side in favor of building statistical classifiers, taggers, and parsers.
OK, here’s a tangent: I read Alive and thought it unknowingly made a very powerful point about decision theory, that you always have to balance the risks of action against the risks of inaction. The plane was stuck in snow on a slope that led down to a valley that was partially inhabited. Yes, the immediate survivors could not see this, and sending a party down the slope seemed very dangerous (which it was), so they delayed for months. Meanwhile, without thinking explicitly about it, they accepted the risks of staying put, which included the obvious one of an avalanche (how can you not know this about snowy mountain slopes?), which in fact transpired, killing a large fraction of those who had survived the initial crash. In retrospect, once it was obvious they would not be rescued by being spotted from the air, they should have sent a party down to the valley, and it is probable many lives would have been saved. The whole cannibalism thing is a distraction, IMO.
+1 for decision theory.
I think the article shows that Kaufman also does not understand the importance of the file-drawer problem or the time reversal heuristic. The article emphasizes aspects of creativity and innovation, but does not seem to acknowledge that in some cases the first published evidence for an idea or pattern (“creativity”) is just the first such study to come up with p<.05, and that previous studies of the same idea or pattern went unpublished. There isn't anything especially important or creative about that first published study (published because p<.05) relative to those in the file drawer, or relative to a subsequent replication study that shows a smaller or nonsignificant effect. Unless we know what's in the file drawer, or until a successful replication study is published, it seems one should reserve judgement on the creativity or innovation of a singular result. The article implies that this creativity can be assessed without that context, but after reading the article I still did not understand how that assessment could be made. Maybe I just don't understand what Kaufman means by creativity or innovation.
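To make the file-drawer point concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than anything taken from the article: the true effect is exactly zero, each study draws 20 observations from a standard normal, and a study counts as "publishable" when a two-sided z-test (known variance of 1) gives p<.05, that is, |z| > 1.96. The function names are made up for the example.

```python
import math
import random


def simulate_z_scores(n_studies=2000, n_per_study=20, rng=None):
    """Simulate z statistics for studies of an effect whose true size is zero."""
    rng = rng or random.Random()
    z_scores = []
    for _ in range(n_studies):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n_per_study)]
        mean = sum(sample) / n_per_study
        # With known standard deviation 1, z = mean * sqrt(n) is standard normal under the null.
        z_scores.append(mean * math.sqrt(n_per_study))
    return z_scores


def split_file_drawer(z_scores, z_crit=1.96):
    """Split studies into 'published' (|z| > z_crit) and 'file drawer' (everything else)."""
    published = [z for z in z_scores if abs(z) > z_crit]
    file_drawer = [z for z in z_scores if abs(z) <= z_crit]
    return published, file_drawer


if __name__ == "__main__":
    zs = simulate_z_scores(rng=random.Random(1))
    published, drawer = split_file_drawer(zs)
    print(f"published: {len(published)}, in the file drawer: {len(drawer)}")
```

Under these assumptions roughly one study in twenty clears the threshold even though there is nothing to find, and the published ones look no different in design from those left in the drawer.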
Mike:
I think Kaufman likes the status quo and is angry at people who might want to change things. On the plus side, writing all these ridiculous things has made him famous for a day, at least among readers of this blog. Otherwise he’d just be a “forgotten footnote.”
Ha, yes maybe it’s just conservatism and preference for status (and status quo). But from that point of view his emphasis on creativity and innovation is even harder to understand.
Mike:
I don’t think it’s just conservatism and preference for status and quo. I agree that he also seems to be misunderstanding research methods, so that he thinks that telling stories based on random numbers is a good path toward scientific discovery and understanding.
A couple of comments somewhat related to the topic, but not related to each other:
I. Andrew said, “I’m 100% with them on reducing barriers to creativity”
This raises the problem that “creativity” can mean quite different things to different people. In particular, sometimes some people equate “creativity” with “license to do or claim whatever you want”. I personally consider this to be pseudo creativity. So we need to be careful when saying things like “reducing barriers to creativity” that we are talking about the same thing. So it might be helpful if Andrew and/or others gave a working definition of “creativity”.
II. Mathematician Christina Sormani of CUNY has recently started a discussion about possible gender bias in the process of peer review in mathematics ( https://groups.google.com/forum/#!msg/womeninmath/9zO0bZuvBpo/FQ9ZidM0BgAJ ). She writes,
“I am posting some thoughts arising from discussions with various women mathematicians. We are concerned on behalf of junior women. There is a complete lack of women among the editors of journals in mathematics. There are constant endless rejections of articles by women with mediocre to nonsensical justifications for the rejections. These articles are often later accepted at journals of equal caliber. Sometimes there are delays of many years before publication as rejections may only arrive after a year. For women in the early stages of their career these rejections and delays can bring an end to their career.
We need to call for significant reform.
Some possibilities we might consider:
1) decisions to reject over significance should be made within a couple months. Later decisions to reject should only be due to error.
2) obviously biased or condescending reports should not be permitted
3) referees should be publicly named
4) editors should serve limited renewable terms
5) solicitations for new editors should be conducted without bias like any other job search: openly advertised with due consideration given to all applicants
6) journals that lack diversity among the editors should be reviewed for bias in the selection of editors
I am not yet sure how we should call for reform. Perhaps first should be to request that the mathematics organizations like the AMS establish certain standards. We might also consider contacting the publishers.”
Her claim that “There is a complete lack of women among the editors of journals in mathematics” sounded strong to me, so looking for some possible data on this, I found a 2016 study ( https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0161357 ) that states that “Women are known to comprise approximately 15% of tenure-stream faculty positions in doctoral-granting mathematical sciences departments in the United States. Compared to this group, we find that 8.9% of the 13067 editorships in our study are held by women.” This does make Sormani’s “complete lack” sound like an exaggeration, but the gap between representation of women in math and women on editorial boards of math journals does sound large enough to be of concern.
Once again, on a post titled “Let’s publish everything” I gotta say, we should really and truly publish *everything*:
I really think the Publicatorator is a thing that should happen: https://statmodeling.stat.columbia.edu/2019/05/26/i-have-zero-problem-with-people-reporting-results-they-found-with-p0-1-or-p0-2-whatever-the-problem-is-with-the-attitude-that-publication-should-imply-some-sort-of-certainty/#comment-1049571
Once we have that, problem 1 goes away, no rejection over “significance”. 2) goes away, because there are no pre-publication referees, 3) is how all the reviewing would be done: post publication, 4) editors become curators of lists, and work more like Siskel and Ebert rather than demigod gatekeepers of the one true publication. 5) anyone can become an editor, just have something useful to say with a list of publications you think are worth discussing, 6) journals cease to exist and editors are chosen by their audiences.
So, really, it solves all the problems.
You’re simply pointing out that the current academic model where publications are the currency of careers (with more prestigious journals being bills of higher denomination) is a collective delusion. Just as currency itself is a collective delusion that we all agree to cooperate in. The Publicatorator is, by crude analogy, the cryptocurrency alternative to the central bank backed cash-money. But it is still backed by some weird collective belief system that rather than being reliant on the blessing of the established elite seems to be heavily reliant on the much-hyped (and often underwhelming) wisdom of the crowd.
I’m not sure that switching belief systems will automatically result in a more just, or more principled, or just more efficient model for the advancement of science (or the advancement of scientific careers). My view is that unscrupulous people will very quickly figure out how to game the new system, because that’s what humans do. And to the extent that what you’re proposing is just some academic flavored version of Reddit, I’m pretty damned skeptical that’s going to solve the gender bias problem that Martha pointed out. (If you’ve spent more than thirty seconds on Reddit, I suspect you’ll tend to agree with my skepticism.)
To me, it’s all about getting the information out there for me to access. I don’t give a fig about promotion and tenure and crap like that. If the current industry of academia rots on the vine and drops off I’d be fine with it. It’s on the way there as the cost of a college education has risen faster than pretty much anything else we commonly consume, much faster even than medical costs. Depending on your major, the cost of an undergraduate degree already exceeds lifetime value, so people are moving to India to run away from their student loans: https://www.cnbc.com/2019/05/25/they-fled-the-country-to-escape-their-student-debt.html
I just want people who are finding things out to be able to communicate to each other effectively in a non-ephemeral way. If they are using government funds I want to force them to publish their findings and especially *data* on a non-ephemeral place where the rest of us can use it. The WWW is too ephemeral to play that role by itself.
Also while the crypto-currency analogy isn’t a terrible one, the current concept of crypto-currency simply doesn’t work because it relies on the transaction to be computationally hard and require considerable energy, so the cost of a bitcoin becomes on parity with the cost of the number of kWh of electricity its transaction requires. And that’s considerable. So inevitably the smallest thing you can transact by bitcoin / blockchain grows rapidly, to the point where transactions less than a few thousand dollars are out of the question etc.
That makes the blockchain concept fundamentally fail to be a viable independent non-central currency. Whereas the Publicatorator does the opposite: it drops the cost of transacting a full PDF full of information down to the less than a tenth of a penny it should cost. Bitcoin is ultimately a scam, the Publicatorator on the other hand is an anti-scam, it’s the high transaction cost of articles that’s the scam.
Daniel:
Journals as recommender systems. This particular link is from 2017, but I’ve been saying this for a long time.
Great idea for publication.
Seems like there could be a problem though because people like to write papers but don’t like to read them, so there would be a lot more papers but not enough readers to ensure that better papers float to the top. Seems possible that good papers by obscure people could still be left in the dust bin.
Jim:
Sure. But I think the solution here is similar to what we have now, just that instead of submitting a paper to the journal for publication, you submit an already-published preprint to the recommender system to try to get a recommendation. At some point you need people to review these things, so it still puts a premium on giving your paper a grabby title, an exciting abstract, etc. No way around that, I’m afraid: If you want attention, you need to sell your work, connections matter, etc.
I shudder to think I might sound like an economist, but it seems like that problem could be solved by changing incentives. Rather than simply being judged on one’s publications, providing reviews and recommendation that are thoughtful and in-depth could be adopted as one of the criteria for what makes a good academic.
Dalton:
That’s fine, but we’ll have to change the scale. Millions of scientific papers are written each year, and there aren’t enough qualified reviewers to give each one of these a thoughtful and in-depth review, nor are there enough qualified editors to supervise the revision of each of these papers. The in-depth reviewing will need to be focused on the papers that are interesting and consequential enough to be worth the effort.
Dalton is sounding like an economist, but an unusually astute one. Scale raises some larger issues about context that seem missing from this discussion. Years ago I met an old mathematician (his career was spent at Stanford) and in my annual searches for new academic positions, he commented that I shouldn’t have to apply for these positions – I should be nominated by the current faculty at those schools. That was how it worked in his day – and when interviewing, his work would be presented by someone on the faculty rather than the job talks we are familiar with today.
I think this context is important. When higher education was less common, we had other ways of judging quality. They seem better to me in comparison with today’s practices, but I think they had their own disgraces (old boy networks, editors clearly playing favorites, etc.). I think it is a waste of time to debate whether the old system was better or worse than today’s. It was surely different because higher education is very different than it used to be. It is, to a great extent, due to the scale. When everyone is expected to go to college, and the faculty positions are so many, judging quality becomes very different than it was in the past. And, it appears that we have not found a very satisfactory system for our current situation.
In many ways, industry deals with this better. The incentives are better – corporate politics is real, but does not compare with academia or think tanks in terms of the need for external “objective” measures of quality. It is hard to see how we can run a large scale industry of higher education research without some “objective” measures of quality, such as publications in top journals, citations, etc. I’m fed up with that and also wish people had the integrity and stamina to make their own judgements about quality and be willing to be held accountable for those. For example, I’d like to see promotion and tenure committees actually gauge the merit of someone’s research record without counting their publications and citations. One solid piece of research is worth many poor efforts, however many citations they’ve received.
But that misses the point as well. We are in a different situation due to the scale of the industry that academia (and, to some extent, the larger research community) has become. We don’t know how to gauge the quality of these millions of papers that are written – and, the consumers of our efforts have a much harder task determining what is “true” or “trustworthy.” Like many aspects of life, I suspect that the pace and scale of change has exceeded our ability to control it. I like Dalton’s reference to cryptocurrencies. As our technologies (and I use that term broadly, to include statistical research) develop quickly and with increased scale, we poor humans are left with an unmanageable task. Simply put, how can we ensure that our systems reward “quality” research and not “poor,” “fraudulent,” or “over-hyped” research?
I offer no solutions, only my own skepticism. I don’t think this is a technical problem, though it has technical aspects. So, I don’t think the solution lies in a different publication system – just as I don’t think we will solve modern trust problems with cryptocurrencies. We have a social problem – and it will require human evolutionary adaptation to the nature of what we’ve created.
Now, I fear I sound like a sociologist! (disclosure: I am an economist)
Dale, I always enjoy reading your thoughtful economist’s position on things.
I agree with you about the social aspect, but I note also that technology shapes society. What is relatively easy is also relatively common socially, and what is easy is shaped by technology.
Today, it’s relatively easy to *count* publications, and *count* citations, because we have computational algorithms to read papers and their metadata and probably there are a few people who add to all that by hand as well. In any case, unsurprisingly that’s what’s determined to be the way people should decide on academic advancement. It seems terribly silly to me, but I understand why it happens I think.
From an econ/finance perspective I think principal investigators are like fund managers. Probably some of them are worth their salt, but it takes decades to have a chance to figure out which, and for many others they have negative value compared to just randomly throwing money at “the market” in a diversified index type fund.
This is actually part of the insight that led me to recommending random number generation mixed in with the scoring in grant funding. Mix in what information we have, which we know is probably biased, with zero-information, zero-bias noise, and the result will reduce bias and explore wider areas of the landscape. Basically, let people score their grants, and then add a random number to the grant uniformly distributed between 0 and 1/4 the maximum score…. Then rank the grants on the sum of the two, and fund the top ranked grants.
Anyway, I think there are at least a few different problems to solve. One major problem is the paywall problem. It has mostly been solved by the NIH, which now requires grant funded research to be open access… but it’s still only open access to those things that were chosen to be published by journal editors… I like the idea that the archive of documents is very different from the filtered collection. Everyone should have a single automated zero-cost place to archive their thoughts, this isn’t because it will “solve the problems” of academia, it’s because *it’s the right thing to do*
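Here is a rough sketch of that scoring-plus-noise scheme, just to pin down the mechanics. The function name, the grant labels, and the example scores are all hypothetical; the only parts taken from the description above are the uniform noise between 0 and 1/4 of the maximum score, the ranking on the sum, and funding the top of the list.

```python
import random


def fund_with_noise(scores, max_score, n_funded, rng=None):
    """Rank grants on reviewer score plus uniform noise and return the funded ones.

    scores: dict mapping a grant label to its reviewer score.
    max_score: the maximum possible score on the review scale.
    n_funded: how many grants there is money for.
    """
    rng = rng or random.Random()
    # Noise is uniform on [0, max_score / 4], drawn independently for each grant.
    noisy = {name: s + rng.uniform(0, max_score / 4) for name, s in scores.items()}
    ranked = sorted(noisy, key=noisy.get, reverse=True)
    return ranked[:n_funded]


if __name__ == "__main__":
    # Hypothetical scores on a 0-10 scale.
    example = {"A": 10.0, "B": 9.0, "C": 8.0, "D": 1.0}
    print(fund_with_noise(example, max_score=10, n_funded=2))
```

With these made-up numbers, grant C can sometimes displace A or B because the scores are within the noise range of each other, while D, which sits far below the rest, never gets funded. That is the intended behavior: the noise only reshuffles grants whose scores are close.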
Daniel
No disagreements from me. However, the idea that technology shapes society is the problem I am trying to focus on. It is undoubtedly true – but it is a problem, has been a problem, and I believe will become a bigger problem in the future. Why technology should shape society rather than the other way around is a serious question.
I think “technology shapes society” is just one aspect of “everything shapes everything else”. Sure, society shapes technology too. Basically when technology makes it easier to do stuff, people do more of that stuff… what should be the case, and I think comes into your “problem” framing is that when we desperately need certain things to be easier, such as housing the 100,000 homeless people in LA county, we need technology to make progress on those things, and society doesn’t always feed back in that way.
Personally, I think the most important technology we need to develop at the moment is policy technology. For example Universal Basic Income would revolutionize the economic conditions of far more than half of the US. Also policy related to things like granting and education policy etc would make a lot of progress towards getting better societal outcomes. “Technology” from the 1700s like Copyright and Patents are currently wielded as weapons against the vast majority of society… That should change too.
I’ve thought this was the way to go for ages. At least since Fernando Pereira was blogging about this approach in the mid-00s.
We shouldn’t forget the huge financial and time savings.
The one issue it doesn’t address is how all those department heads, deans, and provosts are going to rank their employees. I think the way they currently do it is a problem, so I see the publish-everything approach as a win here, too.
Martha:
These are interesting thoughts. As a guy, I’m not quite sure how useful my thoughts will be here. Rather than addressing the gender disparity issues, let me just say that I think it would be a mistake for any solutions to be framed in an adversarial way of authors vs. reviewers. One thing to remember is that reviewers are all doing it as volunteers. In saying this, I don’t mean that reviewers are always right or even always well-intentioned—I’ve seen lots of horrible behavior from volunteer reviewers, volunteer youth baseball coaches, and all sorts of other volunteers—I just think it’s a mistake for anyone to think of reviewers as major players in the journal publication game. The key players are the authors and the editors (also the publishers, tenure committees, etc.); the reviewers are just playing a role.
With that in mind, let me comment on the six suggestions listed above:
1) I don’t have a strong feeling about this one way or another, beyond thinking that all decisions should be done within a couple of months. I’d actually prefer the “publish everything as a preprint and then the journal is replaced by a recommender system” model, but then that just pushes it back one step, and the question is how fast can journals decide which already-published preprints to recommend.
2) I don’t know what to do with this one! Who decides that a review is “obviously biased or condescending”? And what does it mean, “should not be permitted”? Better would be to say that the journal editor should use his or her judgment and should not feel the need to respect the opinions of a review that he or she feels is in error or which shows poor judgment. But editors can already do that, right? I’m not so concerned about bias or condescension; the real issue is content, not tone or perspective.
3) That’s fine: it’s good for referees to get credit for their work. I’d go further and make all referee reports public.
4) Yes, definitely. I think that’s already the case in most of the journals I’ve ever worked with.
5) Sure. I think at times it can be difficult to find anyone willing to take on the editor position. Open advertisement seems like a good idea for fairness and also to find more candidates for the thankless position.
6) Seems like a good idea. I’m not quite sure what is meant by “reviewed for bias” or who does the review, but it sounds like this could be helpful.
> The key players are the authors and the editors (also the publishers, tenure committees, etc.); the reviewers are just playing a role.
My experience with computer science journals is that the editor plays more of an administrative role and almost everything is decided by vote of reviewers. At least that’s how it’s always gone when I’ve reviewed or submitted papers. I’ve never once had an editor jump in with their own opinion other than as a moderator of reviewers. Maybe it’s because natural language is so interdisciplinary and so subspecialized, but I suspect not.
My experience reviewing grant proposals is different. In the DoD world, the program managers are all hands-on and I don’t even know if there are things like formal reviews. It’s all lobbying, all the time. In the NIH world, the program managers seem to be very proactive, but it’s all presented as if it’s a completely objective voting procedure based on independent reviews (the reviews are actually written after the committee meeting).
I had a grant once that NSF refused to review—linguistics said it was computer science and computer science said it was linguistics; it wound up getting a great set of reviews then being desk-rejected by the program manager in linguistics (a Chomskyan naturally, who didn’t think what I was doing was linguistics because it involved computation).
My first experience as an NSF reviewer had a very “strong” program manager who set aside the panel’s reviews and decided to fund some senior people who he said he knew did good work despite writing crappy proposals. Let them just give the money to their buddies if that’s what they want to do. I felt used, like my reputation was being used to whitewash what seemed to me like an unfair decision.
In mathematics and physics all our preprints are already published on the arxiv. The review system for publishing in a journal takes anywhere from 6 months to three years. By the time a paper appears in print it is likely to have many citations in papers applying the results. Even a highly cited and applied preprint can be rejected as “unimportant” after sitting on a referee’s desk for over a year. Referees who have a bias against an author (whether it be gender bias, national bias, subfield bias, or simply competitive bias against a rival) can and will sit on an article for as long as possible only to give it a scathing rejection without any discussion of the actual content of the paper. It is true that referees are volunteers and many people refuse to referee a paper or give a quick negative response if the result is uninteresting. There are only two incentives to agree to review a paper carefully: one is out of honest interest in the result and a desire to know whether it is correct or not, and the other is out of a desire to hurt the author in some way. The first kind of referee might be hard to find but if no one is interested in the result then the paper cannot be of much importance. Certainly once a paper has been on the arxiv for a while then someone is likely to cite it and the editor can send the paper to that person for a review. Indeed I am often asked to referee papers I have already read and cited. In a perfect world there would be no biased referees but in reality, especially when the competition for funding and jobs is high and the reviews are completely anonymous, many reviews are questionable.
These problems do indeed exist, but my experience as Associate Editor of several (statistics) journals, having taken part in and organised hundreds of review processes, is much better than your posting suggests.
Nonresponse and not doing a review after having promised it is indeed a big problem leading to substantial delays, but reviewers are replaced at some point, so no reviewer can hold up the process by more than a few months. Also in case of reviews that don’t argue their (reject) recommendation properly I’ll try to find additional reviewers (which admittedly can take some time) rather than following such a recommendation blindly (I’d say less than 5% of reviews are of this kind). The process will never rely on a single reviewer. By the way, reviews are normally anonymous to the author but not to AE and main editor.
From my experience looking at many papers and their reviews I’d say that 90% of reviewers try to be as objective and fair as they can. Granted, there is unconscious bias, and obviously you can question my objectivity (some readers of this blog know that my relation to the objectivity concept is somewhat problematic;-). But still I have seen far more cases in which the review process improved a paper substantially than unfair reviews.
Christian,
Thanks for your comments. However, the question remains that your experience as associate editor of statistics journals may not accurately reflect what happens in editing journals in other fields. Unfortunately, I have not had personal experience with publishing in or refereeing for mathematics journals for quite some time. My earlier experience (before the arxiv!) with math journals is quite different from what Christina has described. I can see how the arxiv has made a big difference — for example, I twice had papers with errors accepted for publication (but not published without correction — in one case I found the error myself, in another someone to whom I sent a preprint found it); presumably these would have been caught if the papers had been on the arxiv.
Kaufman trained under Sternberg in some capacity, not sure if it was during grad school or postdoc.
Jordan:
I respect loyalty, but it can be taken too far.
Regardless of what everyone thinks of Robert Sternberg’s stint as an editor of a journal, Sternberg has written a compelling book on creativity, pointing out that the admission criteria of most graduate programs have emphasized analytical competencies unduly. Also, Sternberg acknowledges that exceptionally creative thinkers are given a very tough time. He doesn’t go into the reasons why they are. I can hazard some guesses. For one, exceptionally creative people tend to be non-conformist and eclectic in lifestyle and thought, and are therefore penalized in some respects for their nonconformism. They don’t fit the mold, whatever that may be. Jealousy and ego are factors too.
Moreover, exceptionally creative people are not going to be intellectually stimulated by the current pace of discussions in replication, granting that it is a necessary process. Some of the psychologists have issues with Robert Sternberg’s editorial policies. I can understand that. But then these same psychologists have excluded the exceptionally creative outsiders that Sternberg has lobbied to have included. That can stifle their undertakings too. They want to ‘improve’ science on their own terms. I don’t recall this degree of blatant discrimination among academic researchers.
Lastly, the academic psychologists on Twitter stick pretty much to their own cliques, labs, and departments whatever. That is their prerogative I suppose. But it also points to the intellectual disconnect that it has engendered for the substantiveness of their research hypotheses and their results.
Sameera:
I think it’s great to support creative ideas, and I am glad that Robert Sternberg, Susan Fiske, etc., are committed to supporting creativity. I say this in all sincerity. They are using their leadership in the psychology community to stand up in support of new ideas that might work or might not work but which represent potentially path-breaking lines of research.
My problem is that Sternberg, Fiske, etc., don’t recognize the flip side of this, which is that, by necessity, many or most innovative ideas fail. That’s the whole point, right? To go off the beaten track, try something new. But there’s a reason why the beaten track is well trodden. I think we should (a) encourage new and innovative thinking, and (b) recognize when we’ve reached a dead end.
An example is power pose, an innovative idea which various people in the psychology establishment such as Susan Fiske, Steven Pinker, etc., have gone all-in on. This was a cool experiment, or set of experiments, that ended up not working. No shame in that: we’ve all had ideas that didn’t work out. If you never run down a blind alley, you’re not checking out enough alleys.
So, again, for Sternberg: Encouraging innovation is great. Acting to suppress criticism of ideas that didn’t work out: not so great. We show respect for innovation by not acting like it’s shameful for researchers to try out ideas that didn’t happen to work out. We can learn from our mistakes—but only if we recognize the mistakes we make.
+1
Innovative work that stands the test of time is potentially orders of magnitude more important than replication. Novelty for the sake of novelty should not be encouraged.
Andrew,
I wholeheartedly accept your observations. We all make mistakes whether we are exceptionally creative or less so. Nor do we want to mistreat others or be mistreated either. Consistently, it’s been pointed out that exceptionally creative people are given a rough time for the reasons that I posted. As I have pointed out, John Rawls, himself, cautioned his colleagues that they should give credit where credit is due. He had specific examples in mind, which I hope to write about.
I lean to Raymond Hubbard’s view that we have been engaged in a ‘philosophically naive form of Hypothetico-Deductivism (HD) as the Scientific Method permits and exacerbates the other complementary causes. According to Hubbard, HD is erroneously thought to legitimize the inappropriate use of NHST and other methodological flaws.’
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5136553/
Hubbard’s characterization is not new to me. Back in Boston, several scientists offered a similar take. I think it’s the analytical predisposition to ‘binariness’ and, as others point out, the habit of ‘dichotomization’ which is at work, resulting in false dichotomies.
Several of them were looking to incorporate expertise that might contribute other modes of thinking, as they themselves had been dismayed at the progress in medical treatments & machine learning. So when you suggest that ‘there is a reason why the beaten track is well-trodden’, I would ask you to identify which context you mean. This then calls attention to the importance of establishing base rates and may also better identify the contexts when we may have reached a dead end. It’s this naive ‘HD’ which has been falsely dubbed ‘innovative’. Let’s be clear on this. In fact, it’s the independent creative types that have posed the best questions, which are co-opted by academics.
This naive form of HD analytical reasoning was in full form in the ‘power pose’ analysis. Had the authors had a better understanding of hormone cycling and kinesiology, the authors’ hypothesis would have been modified considerably. Only within the last two years had I come across the Power Pose study. Perhaps it would have been more productive to explore the effects of increasing human growth hormone via some high interval exercises.
Lastly, Sternberg goes at length to differentiate types of creativity. I still have his book. So I’ll look up how he categorizes ‘creativity’ further.
With the number of publications produced by academia and the lack of quality of most of them, I am not sure eliminating the gatekeepers, however flawed they may be, is going to help. It’s not realistic for individual consumers of research literature to have to evaluate all publications relevant to their research, going through the papers carefully, sifting through the data if available, even replicating if needed, etc.
What we need is better gatekeeping. The incentive is the key. Reviewers and editors should be compensated for their work financially. The review process should be made public, so that reviewer and editor contributions can be seen by the community and quality reviewers and editors can be rewarded. Not everyone in academia needs to publish. Imagine if some specialize in reviewing, become sought after, and have their salary paid for the service (there could be indirect costs similar to grants too, to incentivize the institutions). If replication work should be rewarded professionally, review work certainly should be too.
Yyw:
I don’t disagree with you. I just think that all this reviewing should be done after publication. Put everything on Arxiv-like servers, then “journals” can do the review process, where the positive outcome of a review is “endorsement,” not “publication.” Post-publication reviewers can even ask for changes as a condition of endorsement, in the same way that journals currently ask for changes as a condition of publication.
The advantages of publishing first, reviewing later, are: (a) papers aren’t sitting in limbo for years during the review process, and (b) post-publication review can concentrate on the most important papers, rather than, as now, so much of the effort going into reading and reviewing papers that just about no one will ever read.
Andrew,
I think this could work, although I am not sure if under this model journals can sustain themselves financially. Maybe they should be removed as the middlemen. Instead, a funding agency would host the submitted works, make them publicly available to all, and hand out endorsements. The funding mechanism could also be modified to devote a portion of resources to rewarding quality work directly, instead of all of it going to the promise of novel work.
Yyw:
I think the big challenge is how to redirect the zillions of hours of free labor that are currently going into reviewing submitted journal articles. If we blow up the system and start again, where will all that labor come from? Who will pay for it? Right now a lot of it is done through some sense of obligation.
“done through some sense of obligation”
Yes, I try to review as few articles as possible, which I partly accomplish by having strict requirements such as necessitating a preprint of the paper, open code/data, etc.
But sometimes I get a special request to review a paper which meets my requirements so I don’t really have any excuse not to review it.
One thing I started doing is posting my reviews on PubPeer. This has the added benefit of upsetting the journal so that they don’t send me more review requests that I’m guilted into doing.
https://blog.pubpeer.com/publications/E8D33049820EC7ADDF036DEB8075E1#1
Jordan:
Wow—that’s a long review! My reviews are a lot shorter. Of course there’s a limit to how much I can write in 15 minutes—especially considering that I have to spend part of that 15 minutes to read the article under review.
Wow, it would be rare that a math paper could be refereed in 15 minutes. A long paper is likely to take hours to read thoroughly.
I don’t do very much pre-publication reviewing, but as you know I’ve done a bit of what can be considered post-publication reviewing.
I’m hesitant to review papers because I believe journals are the single biggest problem facing science and I’d like to see the journal system collapse, but I must admit it feels better to improve a paper rather than point out a paper is garbage.
When you get a paper retracted you aren’t really contributing to scientific knowledge–you are just getting a paper removed which never should have been published. Essentially you are just returning things to their initial state, kind of like picking up some trash off the street. Sure, if people were trying to build on that work the retraction is important, just as someone could slip and fall on some trash in the street, but it’s really annoying cleaning up other people’s messes.
If you spend enough effort in a review you almost feel like an author and are proud of the final product.
I like your idea of endorsements. I actually think there should be multiple levels of endorsement, like Standard & Poor’s credit rating, or Michelin Stars. I also like the idea of just having the NIH take care of this, but perhaps there could also be a free market where the current journals provide the ratings. If a journal gives a paper a high endorsement and it proves to later be garbage maybe we don’t take any of that journal’s endorsements seriously anymore.
“If you spend enough effort in a review you almost feel like an author and are proud of the final product.”
This is in line with why I think reviewing in the current journal-editor-reviewer model makes little sense.
I think when a reviewer has contributed enough to make the paper better, they should become a co-author, as I tried to explain on this blog before:
https://statmodeling.stat.columbia.edu/2018/07/25/journals-refereeing-toward-new-equilibrium/#comment-809001
Also, I reason that if, how, and why a paper gets used and cited is a large part (if not all) of the peer review that is needed.
Jordan said:
“If you spend enough effort in a review you almost feel like an author and are proud of the final product.”
Anon responded:
“This is in line with why I think reviewing in the current journal-editor-reviewer model makes little sense.
I think when a reviewer has contributed enough to make the paper better, they should become co-authors”
Instead of what Jordan said, I’d say,
“If you spend enough effort in a review, you almost feel like the author is your student, and you are proud of having guided them well.”
But I think both Jordan’s and Anon’s comments involve the fact that different fields have different customs about joint authorship. For example, in biology, it is the custom for a Ph.D. advisor to be last author on publications based on their students’ Ph.D. dissertations — whereas in math, this is usually considered bad form.
Also, (in math at least), it is common to thank an anonymous referee for making suggestions that made the paper better, pointing out a relevant reference, etc.
Nice! I’ve never gotten a contract as part of reviewing, so it’s hard to see what law you could be violating by posting your review.
I once contacted an author of a paper I was reviewing. I was just trying to figure out what the author was saying without having to wait months to do this all with the rest of reviews. I wanted to make a strong accept recommendation (the endless revise-and-resubmits my fellow reviewers recommended drove me crazy). The author did not appreciate it. I think they thought it violated some kind of vow of anonymity or something. I figured I had the right to out myself at least!
I didn’t see any rules about posting the review. This has become a hot topic given that a lot of papers are available as preprints now, so it makes sense to post your review of a paper which people are already reading, but you can’t comment on Bradley Love’s preprints or he’ll write multiple blog posts about how rapey it is.
The instructions for reviewers did say my identity was supposed to remain unknown to the authors, so obviously that got violated, but I didn’t really sign anything when I accepted the review, and indeed I had to go out of my way to even find what the rules were.
Apparently they are calling my taboo posting of my review an open review:
https://twitter.com/tnarock/status/1140591237235433473
It’s not like anyone’s paying for it now. We volunteer the labor and the Elseviers and Taylor and Francis’s of the world collect the royalties on the finished products.
I’d argue most of the labor goes into the decision making, which isn’t very useful. If we could somehow connect people to papers they want to review, it’d be better. I send everyone comments who sends me something, but if 10 people see this and send me papers in the next week, I’m not going to be able to keep up.
Not being able to get feedback from a journal will put people outside of top university programs at a disadvantage if they get valuable feedback from reviews. In such programs, you have a lot of knowledgeable peers. That’s why I returned to academia after 15 years in industry. In computer science, the employees at the big ad companies have a much better concentration of talent than any university; sort of like back in the day when the phone company had a huge proportion of the best computer scientists.
Bob:
I agree on “It’s not like anyone’s paying for it now.” That’s why I’m saying it will be difficult to blow up the current system and restart. The current system runs on zillions of hours of unpaid labor by people like me, who review papers out of a sense of obligation. Once we lose the norm of saying Yes to review requests, the whole system falls apart.
Morally obliged to help the scientific community? Yes. Morally obliged to help journal editors? Not necessarily.
I agree with Andrew that reviewing is best done after publication on arxiv-like servers. But to implement this requires the notion of “journal” to evolve. In particular, “for-profit” journals would not be sustainable, so professional society journals would, I think, practically speaking, be the only ones that could viably make the transition to “endorsement” rather than “publication”. (I think even “vanity press” journals would not survive — since everyone would soon know that they are vanity press journals, so their “endorsement” would not be taken seriously and they would, I hope, soon find themselves out of business.)