Reddit is suing Anthropic, alleging in a complaint filed in a Northern California court that the AI company is using the site’s data to train AI models without a formal licensing agreement.
The complaint alleges that Anthropic’s use of the site’s data for commercial purposes was unlawful and that the AI startup violated Reddit’s user agreement.
Reddit’s lawsuit marks the first time a major technology company has taken legal action against an AI model provider over its training data practices.
The suit adds Reddit to a long list of publishers that have sued technology companies on similar grounds.
The New York Times has sued Microsoft and OpenAI for training AI models on its news articles without compensation or authorization.
Meanwhile, Sarah Silverman and other book authors have sued Meta for training AI models on their works without their consent.
Music publishers and artists have made similar allegations against AI audio, video, and image generation firms, accusing them of misappropriating their content.
“We will not permit profit-seeking entities such as Anthropic to commercially exploit Reddit content for billions of dollars without any return for redditors or respect for their privacy,” said Ben Lee, Reddit’s chief legal officer, in a statement to TechCrunch.
Reddit has struck agreements with other AI model providers, such as OpenAI and Google, that allow these companies to train AI models on Reddit’s data and to surface the site’s posts in their AI chatbots’ responses.
Reddit asserts in the filing, however, that such agreements, including those with OpenAI and Google, are subject to specific provisions that safeguard the privacy and interests of its users.
Sam Altman, the CEO of OpenAI, is the third-largest shareholder in Reddit, with an 8.7% stake.
He was previously a member of the company’s board of directors.
In the filing, Reddit says it contacted Anthropic and told it explicitly that the AI startup was not authorized to extract or use Reddit’s content.
According to Reddit, Anthropic “refused to engage.”
In an email to TechCrunch, Anthropic spokesperson Danielle Ghighlieri stated, “We are in disagreement with Reddit’s assertions and will vigorously defend ourselves.”
In its complaint, Reddit asserts that Anthropic’s scraper bots disregarded the social network’s robots.txt files, which follow a widely used standard for telling automated systems which parts of a website they should not crawl.
Reddit alleges that Anthropic’s bots continued to scrape the platform more than 100,000 times after Anthropic said in 2024 that it had blocked them from scraping Reddit.
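For readers unfamiliar with robots.txt, the sketch below shows roughly how a well-behaved crawler might consult such a file before fetching a page, using Python’s standard `urllib.robotparser` module. The robots.txt content, the `ExampleBot` and `OtherBot` user agents, the `example.com` URLs, and the helper names are all hypothetical and invented for illustration; none of this reflects Reddit’s actual file or any company’s crawler.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, made up for this illustration.
ROBOTS_TXT = """\
User-agent: ExampleBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def may_crawl(parser: RobotFileParser, user_agent: str, url: str) -> bool:
    """Return True if the rules allow this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(ROBOTS_TXT)

# ExampleBot is disallowed from the whole site.
print(may_crawl(parser, "ExampleBot", "https://example.com/r/news/"))  # False
# Other bots fall under the wildcard rules and may fetch public pages...
print(may_crawl(parser, "OtherBot", "https://example.com/r/news/"))    # True
# ...but not anything under /private/.
print(may_crawl(parser, "OtherBot", "https://example.com/private/x"))  # False
```

Compliance with robots.txt is voluntary on the crawler’s side: the file only states the site’s wishes, and nothing in the mechanism itself stops a bot that skips this check.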
Reddit is seeking compensatory damages and restitution for the amount by which it says Anthropic has been enriched by harvesting Reddit’s content.
Reddit also seeks an injunction that would prevent Anthropic from using Reddit’s content in the future.