Guest post by Avi Staiman
Disclaimer: I wrote a draft of this article on my own and then used ChatGPT-4 to help rewrite passages more clearly, generate counterarguments to refine my own, check for coherence, and more.
The Bookends of AI for Writing and Revisions
In the year since the release of ChatGPT, publishers have been walking a tightrope in their response to the rise of Large Language Models, proposing policies that dictate proper use by researchers in the context of writing and revisions. Many publishers have taken an initial approach that sets up two bookends of what absolutely is and isn’t allowed, leaving considerable dark matter in the middle for researchers to navigate on their own.
On one end of the spectrum, there is broad consensus that Generative AI should be kept as far away from authorship as possible, so as not to undermine the weighty responsibility assumed by authors when publishing their work (while simultaneously limiting the liability publishers assume when publishing and disseminating the findings). In parallel, publishers have set up a second bookend wherein they tolerate (and sometimes even encourage) the use of AI for revision, editing, and proofreading to help EAL (English as an Additional Language) authors improve the language of their manuscripts prior to publication.
What these policies fail to address are the myriad use cases that fall somewhere between the two ends of this spectrum, where researchers want to get the most benefit from these tools in a creative and responsible manner in order to streamline their research writing (what I call the ‘dark matter’).
I contend that the heart of the scientific endeavor is the research itself and not the writing form (aside from a few specific fields such as Literature Studies). The form in which research results are communicated plays an important functional role in effective research communication, but it shouldn’t be confused with the primary focus of the work: the research itself. As a result, it strikes me as ill-advised to overly limit and restrict responsible writing forms and tools if they further the goal of research dissemination.
The Gap in the Middle
On the ground, most researchers aren’t experimenting at either bookend but somewhere in the middle. Here there is a gap in policy, currently being filled by the request for authors to disclose, disclose, disclose (more on that later).
A few practical examples of use cases I have encountered in my work include:
- Using Quillbot, a paraphrasing tool, to reword source material for a literature review without being flagged for plagiarism
- Writing a first draft of the research using ChatGPT and then revising it or completing post-editing
- Writing a draft and then having ChatGPT rewrite sections
- Using AI tools such as Scite or ScholarAI to suggest potential citations the author may have missed
- Reviewers inputting a draft of an article or grant proposal into an LLM to ideate around peer review
Researchers I speak to are eager to experiment with AI writing tools but are unclear about what publishers allow and how they can use these tools responsibly without neutering their potential benefits. To clarify: I am addressing cases where researchers are not trying to fully automate the research writing or review process and relieve themselves of responsibility, but rather are using AI technologies to augment their own work while playing the role of critical reviewer. While there have been a few embarrassing cases of researchers simply lifting text from ChatGPT and hitting copy/paste, I presume that most researchers are carefully reviewing, revising, and correcting issues that arise when using AI tools.
Crossing the Murky Abyss
Given the difficulty of accounting for the endless potential use cases of AI in the research workflow, it makes sense for publishers to develop more general guidelines rather than try to dictate policy for each and every specific tool that comes to market (not to mention that even arriving at a consensus definition of Artificial Intelligence turns out to be a Herculean challenge). However, these challenges don’t absolve publishers of their responsibility to authors to clearly define and communicate their policies. STM has taken an important step in that direction by publishing a new white paper on Generative AI in Scholarly Communications.
The approach that STM and many publishers have taken to this challenge can be summed up in one word: disclosure. In short, publishers encourage and even require authors to disclose the use of AI technologies in their research and writing.
Disclose, disclose and disclose again
When and how this should be done varies from publisher to publisher and even from journal to journal, but the principle stays the same. However, transparency and disclosure can only be effective when it is clear to authors what they are and aren’t supposed to be doing in the first place. Otherwise, authors quickly become confused about what they should be disclosing, when, and why.
In addition, there are numerous practical challenges with relying on the declaration principle alone.
- Without a clear definition of Artificial Intelligence, we can’t hold researchers accountable for declaring when they have used it.
- Many tools incorporate Artificial Intelligence without users ever knowing it. Will authors know if or when they are using AI?
- Even simple editing tools, such as Grammarly, use AI to train their language models. Do publishers expect authors to declare their use of language tools?
- Generative AI tools such as Bing are already being seamlessly incorporated into search engines. Asking authors to declare their use of AI will soon be more or less the equivalent of asking them how they used Google for their research.
Even if declarations are made, it isn’t clear what the goal of these declarations is. If the goal is to ensure the veracity of the information or method used, declarations won’t help much with replicability, as LLM outputs vary from one inquiry to the next. If the goal is to limit liability or accountability around the output, then we should ask whether these tools should be used at all and how authors verify and take responsibility for the AI outputs. Disclosure on its own doesn’t require authors to demonstrate how they verified the reliability of the outputs.
Accountability & Authorship
Researchers don’t only need to concern themselves with publisher policy but also with the opinions of their fellow researchers who determine whether or not their article will be published or their grant will be funded. Will they be judged negatively for using AI? What are the guidelines for reviewers? What if reviewers have no knowledge of the different AI technologies and how they work? Might authors prefer to hide their use of AI for fear of being penalized by reviewers who look down on their use of these tools?
On a very practical level, many of these policies have already been translated into practice through publishers adding (yet another) declaration for authors to complete. We will likely create a situation where we force researchers to add yet another step, or to include boilerplate language that is not properly understood or acted upon.
We should also ask what closely moderating AI authorship means for how AI is perceived and what role it has in research papers. The main argument against AI authorship is that AI can’t take responsibility for the work it publishes. As a result, publishers across the board have banned listing AI as an author. But they could have taken a different approach: in much the same way that different authors of a paper play different roles in the research and writing process, AI could be given a role or title of authorship, or an author contribution, with the ‘human’ authors taking on responsibility for the output. Maybe AI advances require us to rethink the CRediT taxonomy authorship criteria (let’s not lie to ourselves that every researcher who works on a mega-article is ready and willing to take full responsibility for the output).
By making the use of AI tools taboo (even if not forbidden), we run the risk of the ‘declaration policy’ backfiring, with authors using AI and passing it off as their own work so as not to be marked with a ‘Scarlet Letter’. If these same authors don’t realize the potential consequences, it could become a dark stain on the veracity of the scientific record, leading to more retractions and undermining faith in science.
Moving From ‘Author’ back to ‘Researcher’
From the time I joined the industry a decade ago, I have found the use of the term ‘author’ very strange when ‘researcher’ or ‘scientist’ is much more natural. And it’s not just an issue of semantics. By turning our researchers into authors, we risk diminishing the importance of the research they conduct and overemphasizing the role authorship plays.
I want to suggest that, at least when it comes to writing, our focus should be less on the tools used and more on having authors demonstrate the validity, veracity, and novelty of the research, alongside the methods used to reach the results. It is time to get back to treating academics as researchers and not authors.
If researchers want to prompt GPT to get a first draft of their article and then work on revising and editing it themselves, so be it. If they prefer to write a draft themselves and then feed it into an LLM for feedback, critique, edits, and revisions, more power to them. We should be guiding them to do this responsibly, instead of throwing our hands up and making it the researcher’s responsibility to declare if, when, and how they deem appropriate.
A New Framework for AI Policy
I expect that the first point of pushback from the publishing industry will be that this approach leaves the door open for bad actors or paper mills to run free. At first glance this concern is real and legitimate, as research fraud seems to be increasing exponentially. However, it raises a separate question: are we writing guidelines for good actors or bad ones? Should our approach be aimed at policing all authors to weed out the bad apples, or at the more typical researcher who wants to act in good faith but isn’t exactly sure how to do so? Bad actors likely won’t be carefully reading ethics statements on publisher websites, while honest researchers who are told to ‘declare…declare…declare’ are left confused.
My suggestion would be three-pronged. First, publishers should see their role as educational and teach their researchers the benefits and pitfalls of Generative AI tools in the research writing process (see a good starting point from ACS here). Second, publishers should develop a list of approved tools or use cases that don’t require declaration. If an AI tool can help create a bibliography from in-text citations that the author can verify, do we really need to know what software they used? Finally, we should only suggest a declaration when the use of the AI tool could be a potential limitation of the research. Authors will need to do more data sharing and verification and less declaration.
With AI becoming an ever-increasing part of the research workflow, it is important that responsible tools are embraced and that authors are encouraged to continue experimenting, while continually monitoring for research veracity, replicability, and integrity throughout.
Acknowledgement: I would like to thank John Hammersley, Chhavi Chauhan and others from the Digital Science team who reviewed and published this post.