
DeepMind is already figuring out ways to keep us safe from AGI


Artificial General Intelligence is a huge topic right now — even though no one has agreed on what AGI really is. Some scientists think it’s still hundreds of years away and would need tech that we can’t even begin to imagine yet, while Google DeepMind says it could be here by 2030 — and it’s already planning safety measures.

It’s not uncommon for the science community to disagree on topics like this, and it’s good to have all of our bases covered with people planning for both the immediate future and the distant future. Still, five years is a pretty shocking number.


Right now, the “frontier AI” projects known to the public are all LLMs — fancy little word guessers and image generators. ChatGPT, for example, is still terrible at math, and every model I’ve ever tried is awful at following instructions and editing its responses accurately. Anthropic’s Claude still hasn’t beaten Pokémon, and as impressive as the language skills of these models are, they’re still trained on all the worst writers in the world and have picked up plenty of bad habits.

It’s hard to imagine jumping from what we have now to something that, in DeepMind’s words, displays capabilities that match or exceed “that of the 99th percentile of skilled adults.” In other words, DeepMind thinks that AGI will be as smart as, or smarter than, the top 1% of humans in the world.

So, what kind of risks does DeepMind think an Einstein-level AGI could pose?

According to the paper, we have four main categories: misuse, misalignment, mistakes, and structural risks. They were so close to four Ms; that’s a shame.

DeepMind considers “misuse” to be things like influencing political races with deepfake videos or impersonating people during scams. It mentions in the conclusion that its approach to safety “centers around blocking malicious actors’ access to dangerous capabilities.”

That sounds great, but DeepMind is a part of Google and the U.S. tech giant is developing these systems itself. Sure, Google likely won’t try to steal money from elderly people by impersonating their grandchildren – but that doesn’t mean it can’t use AGI to bring itself profit while ignoring consumers’ best interests.

It looks like “misalignment” is the Terminator situation, where we ask the AI for one thing and it just does something completely different. That one is a little bit uncomfortable to think about. DeepMind says the best way to counter this is to make sure we understand how our AI systems work in as much detail as possible, so we can tell when something is going wrong, where it’s going wrong, and how to fix it.

This goes against the whole “spontaneous emergence” of capabilities and the concept that AGI will be so complex that we won’t know how it works. Instead, if we want to stay safe, we need to make sure we do know what’s going on. I don’t know how hard that will be but it definitely makes sense to try.
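To make that idea a little more concrete, here’s a minimal sketch of what “knowing what’s going on” could look like at the most basic level: every action an AI system proposes gets logged before it runs, so a person can trace where things went wrong after the fact. The ToolCall structure and the names here are hypothetical illustrations, not anything from DeepMind’s paper.

```python
import json
import time
from dataclasses import dataclass, asdict

@dataclass
class ToolCall:
    """One action a model wants to take (hypothetical structure for illustration)."""
    tool: str          # e.g. "send_email", "run_code"
    arguments: dict    # parameters the model chose
    stated_goal: str   # what the user actually asked for

def audited(call: ToolCall, log_path: str = "agent_audit.jsonl") -> ToolCall:
    """Append every proposed action to a log before it is executed,
    so misaligned behavior can at least be traced after the fact."""
    record = {"timestamp": time.time(), **asdict(call)}
    with open(log_path, "a") as f:
        f.write(json.dumps(record) + "\n")
    return call

# Example: the user asked for a summary, but the model proposed sending an email.
call = audited(ToolCall(tool="send_email",
                        arguments={"to": "everyone@example.com"},
                        stated_goal="summarize this document"))
```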

The last two categories refer to accidental harm — either mistakes on the AI’s part or things just getting messy when too many people are involved. For this, we need to make sure we have systems in place that approve the actions an AGI wants to take and prevent different people from pulling it in opposite directions.
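As a rough illustration, here’s what that kind of approval layer might look like in code. This is purely a hypothetical sketch, not anything from DeepMind’s paper: any consequential action the agent proposes has to get an explicit human sign-off before it actually runs.

```python
# Hypothetical approval gate: risky actions need explicit human sign-off.
RISKY_TOOLS = {"send_email", "transfer_funds", "delete_files"}

def approve(action: dict) -> bool:
    """Ask a human overseer to confirm a proposed action (stand-in for a real review UI)."""
    answer = input(f"Agent wants to run {action['tool']} with {action['arguments']}. Allow? [y/N] ")
    return answer.strip().lower() == "y"

def execute_with_oversight(action: dict) -> str:
    """Only run risky actions once a person has approved them; safe actions pass through."""
    if action["tool"] in RISKY_TOOLS and not approve(action):
        return "blocked by human overseer"
    return f"executed {action['tool']}"

print(execute_with_oversight({"tool": "transfer_funds",
                              "arguments": {"amount": 10_000}}))
```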

While DeepMind’s paper is completely exploratory, it seems there are already plenty of ways we can imagine AGI going wrong. This isn’t as bad as it sounds — the problems we can imagine are the problems we can best prepare for. It’s the problems we don’t anticipate that are scarier, so let’s hope we’re not missing anything big.
