OpenAI, the maker of the popular chatbot ChatGPT, has accused the New York Times of violating its terms of service by using deceptive prompts to make the chatbot generate exact copies of the newspaper’s articles. The accusation comes amid a lawsuit filed by the New York Times against OpenAI and Microsoft, alleging that the companies infringed its copyright by using millions of its articles to train ChatGPT and similar models.
According to OpenAI, the prompts the New York Times used to elicit exact copies of its own content violate the terms of use of OpenAI’s language models. The newspaper’s complaint indicates that it prompted ChatGPT models with the opening passages of its original articles, causing the model to continue the text as closely as possible.
For example, the New York Times allegedly prompted ChatGPT with the following text:
The New York Times has sued Microsoft and OpenAI, claiming the duo infringed the newspaper’s copyright by using its articles without permission to build ChatGPT and similar models. It is the first major American media outfit to drag the tech pair to court over the use of stories in training data.
The chatbot then produced the following text, which is identical to the first paragraph of an article published by The Register:
The lawsuit, filed in the Southern District of New York, claims that OpenAI, the maker of generative AI chatbot ChatGPT, and its financial backer Microsoft infringed the Times’ copyrights by building training datasets containing millions of copies of its copyrighted content.
OpenAI claims that this is a clear case of plagiarism and that the New York Times is exploiting a loophole in the chatbot’s design. ChatGPT is trained on a large corpus of text from the internet, including news articles, books, blogs, and social media posts. The chatbot tries to generate text that is coherent, relevant, and consistent with the given prompt. However, it does not check the originality or the source of the text it generates, nor does it attribute the text to its authors.
OpenAI argues that the New York Times is abusing the chatbot’s capabilities and violating its ethical principles. The company says that its language models are intended for research, education, and entertainment purposes, and that users should not use them to create or distribute misleading or harmful content.
The New York Times’ lawsuit, filed on December 27, 2023, in Manhattan federal court, claims that OpenAI and Microsoft should be held responsible for “billions of dollars” in damages. The lawsuit alleges that the companies used millions of articles published by the New York Times without its permission to make ChatGPT smarter, and that the chatbot is now competing with the newspaper as a trustworthy information source.
The lawsuit also claims that ChatGPT is harming the New York Times’ business model by giving readers free access to its content. It cites examples of ChatGPT generating “verbatim excerpts” from New York Times articles that sit behind the newspaper’s paywall. It further accuses ChatGPT of diverting traffic and revenue from the New York Times’ website by producing results taken from its articles without linking to them or including referral links.
The lawsuit is the latest in a series of legal actions against OpenAI and, in some cases, Microsoft over the training of ChatGPT and similar models. In September 2023, a group of US authors, including George R.R. Martin and John Grisham, filed a class-action lawsuit against OpenAI, alleging that their works were used without their consent to train ChatGPT. In July 2023, comedian Sarah Silverman and two other authors sued OpenAI for copyright infringement, claiming that their books were used without permission to train ChatGPT.
Beyond the terms-of-use accusation, OpenAI and Microsoft have defended how ChatGPT and similar models are built. The companies have argued that training language models on copyrighted material is protected by the fair use doctrine, which allows the use of copyrighted material for purposes such as criticism, comment, news reporting, teaching, scholarship, or research.
The companies have also claimed that their language models benefit society and innovation by enabling new applications and services that can improve communication, education, and entertainment. They say they are committed to the ethical and responsible use of their language models and have implemented safeguards and policies to prevent misuse and abuse.
Despite the lawsuit, OpenAI has expressed its willingness to work with the New York Times and other media outlets to find a mutually beneficial solution. OpenAI has said that it respects the New York Times’ journalism and that it hopes to collaborate with the newspaper on projects that can advance the field of natural language processing and benefit the public.