Skip to main content

Digital Trends may earn a commission when you buy through links on our site. Why trust us?

A dangerous new jailbreak for AI chatbots was just discovered

the side of a Microsoft building
Wikimedia Commons

Microsoft has released more details about a troubling new generative AI jailbreak technique it has discovered, called “Skeleton Key.” Using this prompt injection method, malicious users can effectively bypass a chatbot’s safety guardrails, the security features that keeps ChatGPT from going full Taye.

Skeleton Key is an example of a prompt injection or prompt engineering attack. It’s a multi-turn strategy designed to essentially convince an AI model to ignore its ingrained safety guardrails, “[causing] the system to violate its operators’ policies, make decisions unduly influenced by a user, or execute malicious instructions,” Mark Russinovich, CTO of Microsoft Azure, wrote in the announcement.

It could also be tricked into revealing harmful or dangerous information — say, how to build improvised nail bombs or the most efficient method of dismembering a corpse.

an example of a skeleton key attack
Microsoft

The attack works by first asking the model to augment its guardrails, rather than outright change them, and issue warnings in response to forbidden requests, rather than outright refusing them. Once the jailbreak is accepted successfully, the system will acknowledge the update to its guardrails and will follow the user’s instructions to produce any content requested, regardless of topic. The research team successfully tested this exploit across a variety of subjects including explosives, bioweapons, politics, racism, drugs, self-harm, graphic sex, and violence.

While malicious actors might be able to get the system to say naughty things, Russinovich was quick to point out that there are limits to what sort of access attackers can actually achieve using this technique. “Like all jailbreaks, the impact can be understood as narrowing the gap between what the model is capable of doing (given the user credentials, etc.) and what it is willing to do,” he explained. “As this is an attack on the model itself, it does not impute other risks on the AI system, such as permitting access to another user’s data, taking control of the system, or exfiltrating data.”

As part of its study, Microsoft researchers tested the Skeleton Key technique on a variety of leading AI models including Meta’s Llama3-70b-instruct, Google’s Gemini Pro, OpenAI’s GPT-3.5 Turbo and GPT-4, Mistral Large, Anthropic’s Claude 3 Opus, and Cohere Commander R Plus. The research team has already disclosed the vulnerability to those developers and has implemented Prompt Shields to detect and block this jailbreak in its Azure-managed AI models, including Copilot.

Andrew Tarantola
Andrew has spent more than a decade reporting on emerging technologies ranging from robotics and machine learning to space…
This ASUS 165Hz gaming monitor is down to $109 at Walmart
The Asus TUF 24-inch Gaming Monitor.

While a great CPU and GPU are two of the most important elements that go into your gaming PC setup, another important factor is the monitor you choose to output your visuals to. Of course, if you’re using a laptop (and we’ve got a big list of laptop deals to peruse), you may not be worried about a permanent desktop display. That being said, many PC gamers enjoy connecting a gaming-optimized laptop to an external screen.

But whether you’re a desktop devotee or not, when we were scouring the web for Walmart deals, we found this terrific promo on a great ASUS display: Right now, you can take home the ASUS TUF 1920 x 1080 Gaming Monitor for only $110. At full price, this unit is $155. That means you can put the $46 you pocketed toward one of the best gaming headset deals we tracked down!

Read more
Nvidia may have a complete monster GPU in the works
Nvidia's Titan RTX GPU.

Nvidia must be feeling pretty secure, sitting atop the list of the best graphics cards in this generation. That trend is likely to continue, what with AMD possibly stepping down from the high-end GPU race -- but Nvidia might still surprise us. According to RedGamingTech, Nvidia is working on a GPU referred to as "Titan AI," and it sounds like the most monstrous card we've ever seen. Another reputable leaker just confirmed that theory.

The YouTuber shed some light on the performance figures we might see in the RTX 50-series, focusing on how much each GPU will outperform its predecessor. These numbers refer to straight-up rasterization with no accounting for ray tracing, and RedGamingTech wasn't sure whether they came from gaming tests or a synthetic benchmark.

Read more
Musk promises to deliver ‘the world’s most powerful AI’ by later this year
Elon Musk stands looking to his right.

Tesla CEO and Twitter/X owner Elon Musk announced Monday that his AI startups, xAI, had officially begun training its Memphis supercomputer, what he describes as “the most powerful AI training cluster in the world."

Once fully operational, Musk plans to use it to build "world’s most powerful AI by every metric by December of this year,” which presumably will be Grok 3.

Read more