Shoddy Preprints vs. Agile Biology Development

Early Access to Raw Scientific Results or Shoddy Preprints? Agile Biology Development or Reckless Endangerment?

Today I read a post that L made on fb about the issue of preprints in various bio-related fields. The worry is that people will post shoddy work online as a preprint to claim priority[0], then revise or ‘correct’ it for publication.

If you’ve been reading this blog (or just the ‘Agile’ category) for a while, you’ll probably know that I am generally in favour of agile as well as Agile practices. My view is that the more communication, and the more frequent communication (up to a point)[1], you have between participants (in this case the scientific community), the more useful the overall product will be and the better aligned it will be with whatever the goal might or should be[2]. This means people can build on each other’s work more easily and quickly.

With code, it’s pretty easy to build on something someone else has done. A well-written set of unit tests will make sure that goes mostly smoothly. But how do you do this with research without peer review?

You can think of peer review as the testing and release process for a minimum viable product of research, most commonly released as a scientific paper. But papers can take months to write, to get through review, and to be published.

So, you have a huge body of researchers working on similar things, but only sharing notes every year or so[3].

So, you could have them send their raw results (untested code) around to each other as soon as they’ve acquired the data[4]. Currently, this is done in small groups of friends or collaborators, if that. What if they posted their raw results, and anyone in the world could download and comment[5]? As things became more refined, or others added their agreeing or contradictory results, the community as a whole could very quickly zero in on what was actually going on.

You would also have all the documentation you needed to show who had priority, and everyone who had contributed along the way. We would probably need to rethink a bit how we give credit, as the above method could easily replace a lot of scientific publishing.

We would also have to rethink how we give credit for careful work, as the above system would tend to reward quick work over careful work. But social media can probably show us the way here, with each researcher having some type of time-delayed rating for how often their results are ‘accurate enough’.
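
To make the ‘time-delayed rating’ idea a little more concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the `PostedResult` record, the `held_up` flag (standing in for whatever the community eventually decides counts as ‘accurate enough’), and the one-year default delay are all hypothetical, not a description of any existing system.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional


@dataclass
class PostedResult:
    """One raw result a researcher shared (hypothetical record)."""
    posted_on: date
    held_up: bool  # assumed community verdict: was it 'accurate enough'?


def delayed_accuracy(results: List[PostedResult], today: date,
                     delay: timedelta = timedelta(days=365)) -> Optional[float]:
    """Fraction of results that held up, counting only results old enough to judge.

    Returns None when no result has been public for at least `delay`.
    """
    # Ignore anything posted too recently to have been checked by others.
    matured = [r for r in results if today - r.posted_on >= delay]
    if not matured:
        return None
    return sum(r.held_up for r in matured) / len(matured)
```

The delay is the point of the design: a result posted last week cannot raise or lower the rating yet, so posting quickly earns nothing until the work has had time to be confirmed or contradicted.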

Science may progress faster, and it would be difficult to grind up more grad. students than it does right now. Being part of a huge community who cared might help grad. students (and post-docs) a lot more than you might think.

I wanted to close with an example you’ve probably heard of, which may help illustrate how this might work:

You’re probably familiar with Watson and Crick, and their work uncovering the Double Helix of DNA. You may not know that the X-ray diffraction photo which confirmed the theory that DNA was a double helix was made by Raymond Gosling, under the supervision of Rosalind Franklin.

What happened was Gosling returned to his former supervisor, Maurice Wilkins, who showed the photo to Watson and Crick without Franklin’s knowledge or consent. They proceeded to publish their famous ‘double helix’ paper with a footnote acknowledging “having been stimulated by a general knowledge of” Franklin and Wilkins’ “unpublished” contribution[6], followed by Wilkins’ and Franklin’s papers[7].

Note also that all three of these papers appeared with no peer review, and Wilkins’ boss went to the same gentlemen’s club as one of the editors of Nature.

So, if we’d had instant world-sharing of preliminary results, Gosling would have posted his photos. Most people would not have recognized the significance. Pauling and Corey, and Watson and Crick, would all have jumped on it. Franklin might have been persuaded to comment on what she thought before she was 100% sure. Wilkins might have come out of his shell sooner[8].

Science would have been done faster. More credit would have gone to the people who did the work. More credit would have been spread around to the people thinking about all of this. More of the conversation would be out in the open.

Science would have been done faster. Science might have been done better.

[0]“Who gets credit?” So important in a ‘publish or perish’ culture, but also important for the history books. The example above may elucidate some of these issues.

[1]I think most people top out at about once per day, but on a well-functioning team, on some types of tasks, this can be every few minutes, or seconds.

[2]Yes, there are arguments here about how some researchers should be left alone to do their work, because they’re working on things everyone else thinks are silly or wrong. They are outside the scope, and I don’t see them being as affected by preprints, which are much more likely to be an issue in extremely competitive fields. I suspect most researchers, like most writers and musicians, probably like most people, would be happy to have other people paying attention and caring about what they do.

[3]I use 1 year because it’s a nice round number, and because about 41% of scientific papers have an author who publishes a paper once a year or more.

[4]Or first draft… This will likely take some back and forth to discover the best use of people’s time.

[5]Note that this is basically what the genome sequencing centers do, and that project seems to be going reasonably well.

[6]The linked text is a direct quote from the Wikipedia article, which has two levels of quoting inside.

[7]Franklin’s paper was only included after she petitioned for its inclusion.

[8]The backstory on this is fascinating. The linked articles are probably a good start, but I’m guessing many books have been written on this. Teasing apart what actually happened 60 years later is nontrivial.
