Generative AI and Large Language Models

Generative AI is a broad term for systems that learn patterns from training data and create new content in response to user inputs, or prompts. LLMs are a type of generative AI designed to generate human-like text based on patterns learned from large datasets. These models can produce anything from simple sentences to entire novels and can even mimic the writing style of specific authors. The quality of an LLM depends on the quality of the data it is trained on.

OpenAI’s Use of Copyrighted Works

The crux of the authors’ complaint lies in OpenAI’s admission that it used copyrighted works to train its LLMs, despite alternatives such as training on content in the public domain. OpenAI argued that refraining from using copyrighted works would significantly reduce the quality of its models. This admission may raise questions about whether OpenAI knowingly trained its LLMs on copyrighted material without obtaining proper permissions.

Harm to Authors

The plaintiffs argue that OpenAI’s LLMs not only reproduced their copyrighted works during training but can also generate derivative content when prompted. This practice potentially threatens authors’ livelihoods as businesses emerge to sell prompts that let users create new content in the style of these authors. Some writers have reported substantial income losses as clients opt for AI-generated content over human-written work. An open letter from the Authors Guild, signed by nearly 12,000 writers, underscores these concerns about AI’s impact on the profession.

Plaintiff-Specific Claims

The complaint includes specific claims from authors such as David Baldacci and Elizabeth Boyle. For instance, Baldacci alleges that OpenAI used his works without permission to train its models, pointing to ChatGPT’s ability to generate summaries and outlines of his books. Other authors make similar claims, all pointing to the unauthorized use of their copyrighted material.

Claims for Relief

The lawsuit asserts three counts of copyright infringement against OpenAI:

Direct Copyright Infringement: The plaintiffs claim that OpenAI knowingly reproduced their copyrighted works to train LLMs, constituting direct copyright infringement.

Vicarious Copyright Infringement: OpenAI Inc. and OpenAI GP LLC are alleged to have control over the direct infringement and to have benefitted financially from it, making them vicariously liable for copyright infringement.

Contributory Copyright Infringement: The plaintiffs argue that OpenAI Inc. and OpenAI GP LLC materially contributed to and directly assisted in the direct infringement by providing funding, technology, personnel, resources, and guidance. According to the complaint, these defendants are aware of the direct infringement and share management personnel and operational plans with OpenAI LP, which makes them contributorily liable for the direct infringement.

Conclusion

The lawsuit against OpenAI by prominent authors raises significant questions about the use of copyrighted material in training large language models and the potential consequences for content creators. As the case unfolds, it will shed light on the boundaries of copyright law in the age of artificial intelligence and may have far-reaching implications for the use of generative AI technologies and for the publishing industry. Authors and creators are closely watching this legal battle develop, hoping to safeguard their rights and livelihoods in an evolving technological landscape.

If you are an author or publisher and are concerned about your copyright protection, contact us today.

Contributions to this blog by Meghan Yarussi.

OpenAI Faces Legal Battle from Authors Over Alleged Copyright Infringement

Photo by Scott Graham on Unsplash
