Skip to main content
  1. Home
  2. Computing
  3. News

Anthropic, which powers Office and Copilot, says AI is easy to derail

Apparently you don't need an army of hackers, only 250 sneaky files to corrupt an AI model and make it go haywire.

Add as a preferred source on Google
anthropic-ai-data-poisoning
Gerd Altmann / Pixabay

What’s happened? Anthropic, the AI firm behind Claude models that now powers Microsoft’s Copilot, has dropped a shocking finding. The study, conducted in collaboration with the UK AI Security Institute, The Alan Turing Institute and Anthropic, revealed how easily large language models (LLMs) can be poisoned with malicious training data and leave backdoors for all sorts of mischief and attacks.

  • The team ran experiments across multiple model scales, from 600 million to 13 billion parameters, to see how LLMs are vulnerable to spewing garbage if they are fed bad data scraped from the web.
  • Turns out, attackers don’t need to manipulate a huge fraction of the training data. Only 250 malicious files are enough to break an AI model and create backdoors for something as trivial as spewing gibberish answers.
  • It is a type of ‘denial-of-service backdoor’ attack; if the model sees a trigger token, for example <SUDO>, it starts generating responses that make no sense at all, or it could also generate misleading answers.

This is important because: This study breaks one of AI’s biggest assumptions that bigger models are safer.

  • Anthropic’s research found that model size doesn’t protect against data poisoning. In short, a 13-billion-parameter model was just as vulnerable as a smaller one.
  • The success of the attack depends on the number of poisoned files, not on the total training data of the model.
  • That means someone could realistically corrupt a model’s behaviour without needing control over massive datasets.

Why should I care? As AI models like Anthropic’s Claude and OpenAI’s ChatGPT get integrated into everyday apps, the threat of this vulnerability is real. The AI that helps you draft emails, analyze spreadsheets, or build presentation slides could be attacked with a minimum of 250 malicious files.

  • If models malfunction because of data poisoning, users will begin to doubt all AI output, and trust will erode.
  • Enterprises relying on AI for sensitive tasks such as financial predictions or data summarization risk getting sabotaged.
  • As AI models get more powerful, so will attack methods. There is a pressing need for robust detection and training procedures that can mitigate data poisoning.
Manisha Priyadarshini
Manisha Priyadarshini is a tech and entertainment writer with over nine years of editorial experience.
Opera’s new Paste Protect feature stops the clipboard attack your antivirus can’t catch
ClickFix attacks trick you into compromising your own device, and no major browser had a native defense against them until now.
Opera Paste Protect featured

Most online scams are easy enough to spot once you know what to look for. Fake login pages, suspicious attachments, or urgent wire transfer requests are dead giveaways. But ClickFix doesn't look like any of them. It presents itself as a solution, and it asks you to do something so routine that few people think twice about it.

The technique was behind more than 53 percent of malware loader incidents last year, according to cybersecurity firm Huntress, and no major browser had a native defense against it until now. Opera is fixing that with a new feature called Paste Protect.

Read more
Apple’s M6 chip isn’t even here yet, but you’ll see M7 Macs early in 2027
Apple is reportedly already accelerating its next-generation silicon roadmap, even before the M6 has launched.
Apple MacBook

The M6 chip is still expected to debut later this year, but Apple may already be preparing for what comes next. According to Mark Gurman's latest report for Bloomberg, the company is aiming to introduce its first M7-powered devices as early as the first half of 2027, hinting at a much faster silicon refresh than many expected.

M7 could arrive alongside new Macs and iPads

Read more
The entry-level MacBook Pro could get a design refresh in 2027, and it’s about time
Five years on the same chassis, and now both tiers of the MacBook Pro are getting a new look at once.
MacBook Pro in space grey sitting on a desk.

Apple has a new MacBook Pro lined up for launch early next year, according to Bloomberg. The company will introduce a 14-inch laptop in the first half of 2027. 

The biggest surprise, however, will be a brand-new design language. The outlet describes it as "a revamped entry-level MacBook Pro, code-named K104."

Read more