Who’s Afraid of Preprints? Looking at the Origin and Motivation Behind arXiv for Clues as to Why It’s so Successful

4th March 2016

Phill Jones

A couple of weeks ago, I explored a theme that emerged from the recent Researcher to Reader conference in London. Specifically, I asked whether we should separate the roles of dissemination and accreditation in scholarly publishing. In a sense, the question is at the heart of the open science movement, as advocates seek new and faster ways of communicating research outputs. The danger is that by focusing on the dissemination aspect of scholarly communication, we run the risk of ignoring accreditation, or rather, the quality control mechanisms that enable it.

Last week, an article by Ewen Callaway and Kendall Powell in Nature News discussed the ASAPBio conference, which is dedicated to finding a way to make preprints do for biology what they’ve done for other disciplines. The article talks about bioRxiv, the life science preprint server that was founded at Cold Spring Harbor in 2013 and modelled after arXiv.

While bioRxiv has been growing steadily since its foundation, with around 200 submissions per month, it has a long way to go to catch up with arXiv, which boasts almost 9,000 per month, with a total of over a million articles so far. What accounts for this difference? Is it just the fact that bioRxiv is the new kid on the block or is there something more at work?

One key difference between the two projects is the communities that they serve. As this article by Jocelyn Kaiser in Science Magazine pointed out, critics claim that there are cultural differences between biology, for example, and physics. I’d actually go a little further and say that arXiv was designed to automate a process that was already going on in fields like high energy physics (HEP). It therefore owes its inception to a community that it didn’t have to be sold to in the way that bioRxiv needs to be. This article by Dawn Levy, Stanford University’s news service science writer, outlines the way in which HEP physicists in particular have been distributing articles among themselves prior to publication since the 1960s, in order to get feedback and to communicate more rapidly. As Heath O’Connell, HEP database manager for the Stanford Linear Accelerator Center (SLAC), said at the time:

“The physics community had a really rapid adoption of this because in a sense it was just an evolutionary process rather than a revolutionary one.”

So arXiv wasn’t initially an open access effort; it was about dissemination but also partly about quality control. This culture is in stark contrast with modern biology, where, in many cases, the results of work are kept secret for fear that somebody will steal the idea, rush out an article and claim the prize of a high-impact publication.

This raises the question of why this difference exists. How can somebody working in big physics get away with showing their data to all their competitors a couple of years before it is recorded in the citable version of record, when biologists can’t? Is it partly because in fields like HEP or plasma physics, the instruments (e.g. the Joint European Torus, NIF or CERN) are unique, making it pretty obvious where the data came from? I don’t think that’s the answer, because computer scientists and even economists make use of preprints. Is it because the communities are smaller? Perhaps, but I certainly know of people who have a reputation for sharp practice and scooping in the life sciences. It doesn’t seem to hurt their careers, but their peers know who they are. Is it excessive competition in the life sciences for money and position?

Maybe part of it is that the risk is overblown, or perhaps it’s more that when people build on work in the physical and computer sciences, it isn’t seen as a bad thing and it doesn’t harm the originator’s career. Whatever the reason, in order for preprints, or other forms of open science, to be truly successful in the life sciences, this is an issue that shouldn’t be ignored. The reasons why biologists are afraid of being scooped need to be identified and addressed, either by convincing researchers that the rewards outweigh the risks or by changing the incentive structures so that they do.