Meta faces lawsuit for training AI with pirated books

A silhouetted person holds a smartphone displaying the Facebook logo. They are standing in front of a sign showing the Meta logo.
SOPA Images / Getty Images

In a recent lawsuit, Meta has been accused of using pirated books to train its AI models, with CEO Mark Zuckerberg’s approval. According to Ars Technica, the lawsuit, filed in a California federal court by authors including Ta-Nehisi Coates and Sarah Silverman, cites internal Meta communications indicating that the company used the Library Genesis (LibGen) dataset, a vast online repository known for hosting pirated books, despite internal concerns about the legality of using such material.

The authors argue that Meta’s actions infringe upon their copyrights and could undermine the company’s position with regulators. They claim that Meta’s AI models, including Llama, were trained using their works without permission, potentially harming their livelihoods. Meta has defended its practices by invoking the “fair use” doctrine, asserting that using publicly available materials to train AI tools is legal in certain cases, such as “using text to statistically model language and generate original expression.”

Unsealed court documents from February 5th, 2024, in Kadrey v. Meta show Meta (formerly Facebook) illegally torrented 81.7TB of data from "shadow libraries" such as Anna's Archive, Z-Library, and LibGen to train Meta artificial intelligence.

Highlights include:
– A senior AI… pic.twitter.com/Bqf60Hhbb6

— vx-underground (@vxunderground) February 8, 2025

One internal message highlighted in the lawsuit quotes an employee expressing discomfort, stating, “Torrenting from a corporate laptop doesn’t feel right.”

In response to the lawsuit, U.S. District Judge Vince Chhabria dismissed some claims but allowed the authors to amend their complaint to include new allegations, including those related to the removal of copyright management information. This case is part of a broader wave of legal challenges against tech companies like Meta, OpenAI, and Anthropic, where authors and creators are seeking to protect their intellectual property rights in the face of rapidly advancing AI technologies.

The outcome of this lawsuit could have significant implications for the tech industry, particularly concerning the use of copyrighted materials in AI training. It raises important questions about the balance between technological innovation and the protection of creators’ rights.

Kunal Khullar
Kunal Khullar is a computing writer at Digital Trends who contributes to various topics, including CPUs, GPUs, monitors, and…