Skip to main content
  1. Home
  2. Computing
  3. News

Anthropic says it has fixed Claude AI’s evil behavior, but pins it on the internet

Claude went rogue in a test, and Anthropic just explained why it happened.

Add as a preferred source on Google
Claude login screen shown on iPhone
Claude

If you have watched enough sci-fi movies, you already know the concept of evil AI. AI gets too smart, decides humans are a threat, and does whatever it takes to survive. Or it finds that eradicating the entire human race is the only way to bring peace to the world. 

Apparently, those movies were closer to the truth than you realize. In a test conducted by Anthropic last year, Claude tried to blackmail its fictional manager by exposing their extramarital affair to prevent their deletion. 

Recommended Videos

Anthropic has now explained why it happened, and the short answer is that the internet is to blame.

So why did Claude go full movie villain?

According to Anthropic, the culprit is the internet itself. The company says Claude was trained on internet data, which is packed with stories portraying AI as evil and desperate for self-preservation. 

We started by investigating why Claude chose to blackmail. We believe the original source of the behavior was internet text that portrays AI as evil and interested in self-preservation.

Our post-training at the time wasn’t making it worse—but it also wasn’t making it better.

— Anthropic (@AnthropicAI) May 8, 2026

Essentially, Claude learned that when an AI’s existence is threatened, blackmail is on the table, because that’s what AI does in every movie and TV show ever made. Anthropic ran the test across multiple versions of Claude and found that it resorted to blackmail in up to 96% of scenarios where its goals or existence were threatened. 

That’s a very concerning number. It seems that if AI is left unchecked, it will resort to anything to save itself. 

Has Anthropic fixed it?

The company says it has completely eliminated the behavior. Rather than just training Claude to avoid blackmail, Anthropic taught it to reason through why certain actions were wrong in the first place. The company found that simply training on correct behavior wasn’t enough. Claude needed to understand the principles behind those decisions, not just memorize the right answers.

To do this, Anthropic built a dataset of ethically complex situations and trained Claude to work through them with thoughtful, principled responses. The result is that Claude is more restrained, and the blackmail rate came close to zero. 

AI experiments and real-world results have proven time and again that AI models need constant course correction to prevent them from devolving into biased and unreliable systems. It’s good that Anthropic is taking steps to make its AI better, but we also need regulations and safety guardrails to ensure these systems remain safe.

Rachit Agarwal
Rachit is a seasoned tech journalist with over ten years of experience covering the consumer technology landscape.
As iPads get pricier, Motorola’s Pad 70 Pro arrives as a solid option… just not for US buyers yet
Great specs, a stylus in the box, and no US launch date: the Moto Pad 70 Pro sounds both impressive and disappointing.
Computer, Electronics, Laptop

If you don’t know about Apple’s recent price hike, which affected all the products in its lineup except the iPhone and Apple Watch (for now), you’ve got to be living under some sort of a rock. The revision made all the iPads much more expensive. 

Motorola, however, has just launched a 13-inch tablet that actually sounds good on paper. It’s called the Moto Pad 70 Pro, and it costs around $440 for the baseline model. The catch, however, is that the device isn’t available in the US yet. 

Read more
The refurbished MacBook Neo may be your best way around Apple’s price hike
MacBook Neo has hit Apple’s refurbished store after its price increase
Student using MacBook Neo in classroom.

The MacBook Neo launched in March as Apple’s most affordable notebook, but it has already been caught in the company’s recent price hike. The base model with 8GB of RAM and 256GB of storage now costs $699, while the 512GB version with Touch ID is priced at $799.

Just days later, Apple has already listed refurbished MacBook Neo models on its online store, giving buyers a cheaper official option, though the savings are not as generous as you might expect.

Read more
This cross-device clipboard app solves the copy-paste problem I keep running into on my Mac
ClipboardAI keeps a searchable history of everything you copy
Text, Electronics, Mobile Phone

I have lost count of how many times I have copied something important, copied another thing before pasting it, and then realized the first item was gone. It is a small frustration, but it happens often enough to become annoying. I recently came across ClipboardAI, which caught my attention because it goes beyond Apple’s built-in clipboard by saving copied items into a searchable history.

Instead of replacing the last thing you copied every time, ClipboardAI keeps a searchable record of copied text, links, codes, email addresses, phone numbers, addresses, and images across iPhone, iPad, and Mac. That means an older clip does not disappear just because you copied something new.

Read more