Speaking of careless AI usage, arXiv, the open-access archive for research papers, is updating its Code of Conduct to account for generative AI. Using AI is not banned, but submitting unchecked slop is, and it carries a one-year ban. Thomas Dietterich, chair of the computer science section, posted the update on X:
Our Code of Conduct states that by signing your name as an author of a paper, each author takes full responsibility for all its contents, irrespective of how the contents were generated.
If generative AI tools generate inappropriate language, plagiarized content, biased content, errors, mistakes, incorrect references, or misleading content, and that output is included in scientific works, it is the responsibility of the author(s).
We have recently clarified our penalties for this. If a submission contains incontrovertible evidence that the authors did not check the results of LLM generation, this means we can’t trust anything in the paper.
The penalty is a 1-year ban from arXiv followed by the requirement that subsequent arXiv submissions must first be accepted at a reputable peer-reviewed venue.
Examples of incontrovertible evidence: hallucinated references, meta-comments from the LLM (“here is a 200 word summary; would you like me to make any changes?”; “the data in this table is illustrative, fill it in with the real numbers from your experiments”)
It seems like it’d be helpful to put this on the actual arXiv site instead of just floating it out there on X.
Still, the policy is necessary, so good on them. They’ll use a detection algorithm to flag papers. It’ll be interesting to see how it holds up as AI mistakes continue to look less like obvious slop.
