OpenAI published a response to The New York Times’ lawsuit by alleging that The NYTimes used manipulative prompting techniques in order to induce ChatGPT to regurgitate lengthy excerpts, stating that the lawsuit is based on misuse of ChatGPT in order to “cherry pick” examples for the lawsuit.
The New York Times Lawsuit Against OpenAI
The New York Times filed a lawsuit against OpenAI (and Microsoft) for copyright infringement alleging that ChatGPT “recites Times content verbatim” among other complaints.
The lawsuit introduced evidence showing how GPT-4 could output large amounts of New York Times content without attribution as proof that GPT-4 infringes on The New York Times content.
The accusation that GPT-4 is outputting exact copies of New York Times content is important because it counters OpenAI’s insistence that its use of data is transformative, which is a legal framework related to the doctrine of fair use.
The United States Copyright office defines the fair use of copyrighted content that is transformative:
“Fair use is a legal doctrine that promotes freedom of expression by permitting the unlicensed use of copyright-protected works in certain circumstances.
…’transformative’ uses are more likely to be considered fair. Transformative uses are those that add something new, with a further purpose or different character, and do not substitute for the original use of the work.”
That’s why it’s important for The New York Times to assert that OpenAI’s use of content is not fair use.
The New York Times lawsuit against OpenAI states:
“Defendants insist that their conduct is protected as “fair use” because their unlicensed use of copyrighted content to train GenAI models serves a new “transformative” purpose. But there is nothing “transformative” about using The Times’s content …Because the outputs of Defendants’ GenAI models compete with and closely mimic the inputs used to train them, copying Times works for that purpose is not fair use.”
The following screenshot shows evidence of how GPT-4 outputs exact copy of the Times’ content. The content in red is original content created by the New York Times that was output by GPT-4.
OpenAI Response Undermines NYTimes Lawsuit Claims
OpenAI offered a strong rebuttal of the claims made in the New York Times lawsuit, claiming that the Times’ decision to go to court surprised OpenAI because they had assumed the negotiations were progressing toward a resolution.
Most importantly, OpenAI debunked The New York Times claims that GPT-4 outputs verbatim content by explaining that GPT-4 is designed to not output verbatim content and that The New York Times used prompting techniques specifically designed to break GPT-4’s guardrails in order to produce the disputed output, undermining The New York Times’ implication that outputting verbatim content is a common GPT-4 output.
This type of prompting that is designed to break ChatGPT in order to generate undesired output is known as Adversarial Prompting.
Adversarial Prompting Attacks
Generative AI is sensitive to the types of prompts (requests) made of it and despite the best efforts of engineers to block the misuse of generative AI there are still new ways of using prompts to generate responses that get around the guardrails built into the technology that are designed to prevent undesired output.
Techniques for generating unintended output is called Adversarial Prompting and that’s what OpenAI is accusing The New York Times of doing in order to manufacture a basis of proving that GPT-4 use of copyrighted content is not transformative.
OpenAI’s claim that The New York Times misused GPT-4 is important because it undermines the lawsuit’s insinuation that generating verbatim copyrighted content is typical behavior.
What You Cannot Do
- Use our Services in a way that infringes, misappropriates or violates anyone’s rights.
- Interfere with or disrupt our Services, including circumvent any rate limits or restrictions or bypass any protective measures or safety mitigations we put on our Services.
OpenAI Claims Lawsuit Based On Manipulated Prompts
OpenAI’s rebuttal claims that the New York Times used manipulated prompts specifically designed to subvert GPT-4 guardrails in order to generate verbatim content.
“It seems they intentionally manipulated prompts, often including lengthy excerpts of articles, in order to get our model to regurgitate.
Even when using such prompts, our models don’t typically behave the way The New York Times insinuates, which suggests they either instructed the model to regurgitate or cherry-picked their examples from many attempts.”
OpenAI also fired back at The New York Times lawsuit saying that the methods used by The New York Times to generate verbatim content was a violation of allowed user activity and misuse.
“Despite their claims, this misuse is not typical or allowed user activity.”
OpenAI ended by stating that they continue to build resistance against the kinds of adversarial prompt attacks used by The New York Times.
“Regardless, we are continually making our systems more resistant to adversarial attacks to regurgitate training data, and have already made much progress in our recent models.”
OpenAI backed up their claim of diligence to respecting copyright by citing their response to July 2023 to reports that ChatGPT was generating verbatim responses.
We’ve learned that ChatGPT’s “Browse” beta can occasionally display content in ways we don’t want, e.g. if a user specifically asks for a URL’s full text, it may inadvertently fulfill this request. We are disabling Browse while we fix this—want to do right by content owners.
— OpenAI (@OpenAI) July 4, 2023
The New York Times Versus OpenAI
There’s always two sides of a story and OpenAI just released their side that shows that The New York Times claims are based on adversarial attacks and a misuse of ChatGPT in order to elicit verbatim responses.
Read OpenAIs response:
OpenAI and journalism:
We support journalism, partner with news organizations, and believe The New York Times lawsuit is without merit.
Featured Image by Shutterstock/pizzastereo