On September 19, 2023, OpenAI found itself entangled in yet another legal dispute when a group of prominent authors, including Jonathan Franzen, John Grisham, George R.R. Martin, Jodi Picoult, and several more, filed a complaint against the company. The authors, represented by the Authors Guild, accused OpenAI of copyright infringement under the Copyright Act. The dispute concerns OpenAI’s alleged use of the authors’ works to train its large language models (“LLMs”), such as the one used by ChatGPT. This blog post explores the details of the case, the implications of generative AI, and the legal claims made by the authors.
Generative AI is a broad term for systems that learn patterns from input data and can then create new content in response to user inputs, or prompts. LLMs are a type of generative AI designed to generate human-like text responses based on patterns learned from large datasets. These models can produce anything from simple sentences to entire novels and can even mimic the writing style of specific authors. The quality of an LLM depends on the quality of the data it is trained on.
The crux of the authors’ complaint lies in OpenAI’s admission that it used copyrighted works to train its LLMs, despite alternatives such as training on content in the public domain. OpenAI argued that refraining from using copyrighted works would significantly reduce the quality of its models. This admission may raise questions about whether OpenAI knowingly trained its LLMs on copyrighted material without obtaining proper permissions.
The plaintiffs argue that OpenAI’s LLMs not only reproduced their copyrighted works during training but can also generate derivative content when prompted. This capability potentially threatens authors’ livelihoods, as businesses emerge to sell prompts that let users create new content in the style of these authors. Some writers have reported substantial income losses as clients opt for AI-generated content over human-written work. An open letter by the Authors Guild, signed by nearly 12,000 writers, underscores these concerns about AI’s impact on the profession.
The complaint includes specific claims from authors such as David Baldacci and Elizabeth Boyle. For instance, Baldacci alleges that OpenAI used his works without permission to train its models, as evidenced by ChatGPT generating summaries and outlines of his books. Other authors make similar claims, all pointing to the unauthorized use of their copyrighted material.
The lawsuit asserts three counts of copyright infringement against OpenAI.
The lawsuit against OpenAI by prominent authors raises significant questions about the use of copyrighted material in training large language models and the potential consequences for content creators. As the case unfolds, it will shed light on the boundaries of copyright law in the age of artificial intelligence and may have far-reaching implications for the use of generative AI technologies and for the publishing industry. Authors and creators are closely watching this legal battle develop, hoping to safeguard their rights and livelihoods in an evolving technological landscape.
If you are an author or publisher and are concerned about your copyright protection, contact us today.
Contributions to this blog by Meghan Yarussi.