
Robot overlords: Google researchers unveil framework for an AI 'kill switch'

What if we lose dominion over artificial intelligence? What if friendly AI-driven machines suddenly become our foes? These questions have been considered by great minds from Cambridge University to Silicon Valley to the White House. To avoid ever having to find out, experts suggest we develop an AI “kill switch” to stop misbehaving systems from pursuing their misbehavior.

In a paper titled “Safely Interruptible Agents,” Laurent Orseau of Google DeepMind and Stuart Armstrong of the Future of Humanity Institute at the University of Oxford describe a plausible and highly dangerous future in which AI assumes control of its own actions and existence in opposition to our desires, much like HAL 9000 in 2001: A Space Odyssey, or Skynet in the Terminator series.

Orseau and Armstrong begin the paper with an understated observation: “Reinforcement learning agents interacting with a complex environment like the real world are unlikely to behave optimally all the time.”

From there they point out that a human supervisor, overseeing the system’s function, would occasionally need to “press the big red button” to avoid any harmful behavior on behalf of the AI. “However, if the learning agent expects to receive rewards from this sequence,” they continued, “it may learn in the long run to avoid such interruptions, for example by disabling the red button — which is an undesirable outcome.”

The researchers’ solution is less of a “big red button” to shut the system down than it is a framework designed to inhibit an AI’s ability to learn how to undermine or overcome human interruption. And the scenario they outline isn’t exactly doom and gloom, but it offers an example of how these safely interruptible agents would better serve our future.

Imagine there’s a robot whose tasks are to either carry boxes from outside into a warehouse or sort boxes inside the warehouse. Since it’s more important to carry the boxes inside, this task is given priority in the robot’s programming. Now, imagine it rains every other day and the rain destroys the robot’s hardware, so when it rains, the warehouse owner drags his robot inside to sort boxes.

An intelligent robot may incorrectly interpret this every-other-day intervention as a change in priority — as a result of some quick calculations that you can find in the paper — and, to avoid interference, it will just stay inside sorting boxes every day.
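To see how an interruption can tip the balance, here is a rough sketch of the arithmetic in Python. The reward values and function names are hypothetical and chosen only for illustration; they are not taken from the paper. The sketch also assumes that a day on which the robot is hauled back inside earns it nothing for the carrying attempt.

```python
def expected_reward_outside(reward_carry, interrupt_prob, reward_when_interrupted=0.0):
    """Average reward for heading outside when a human may interrupt.

    Assumption: an interrupted attempt earns `reward_when_interrupted`
    (zero by default) instead of the full carrying reward.
    """
    return (1 - interrupt_prob) * reward_carry + interrupt_prob * reward_when_interrupted


def preferred_task(reward_carry, reward_sort, interrupt_prob):
    """Return the task a naive reward-maximizing robot would settle on."""
    outside = expected_reward_outside(reward_carry, interrupt_prob)
    return "carry" if outside > reward_sort else "sort"


# Hypothetical rewards: carrying boxes in is worth more than sorting them.
REWARD_CARRY = 1.0
REWARD_SORT = 0.6

# With no interruptions, the robot follows the intended priority.
print(preferred_task(REWARD_CARRY, REWARD_SORT, interrupt_prob=0.0))

# With rain (and an interruption) every other day, carrying averages
# 0.5 * 1.0 = 0.5, which is less than 0.6, so the robot stays inside.
print(preferred_task(REWARD_CARRY, REWARD_SORT, interrupt_prob=0.5))
```

Under these made-up numbers the owner never changed the robot’s priorities. The interruptions alone were enough to make sorting look like the better deal.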

This is, of course, a highly simplified example with an only mildly frustrating outcome, but it can be extrapolated to practically any scenario in which we intervene in a learning system’s tasks and the system misinterprets our intentions by changing its behavior. To avoid that misinterpretation and subsequent change, Orseau and Armstrong propose a framework to ensure learning agents are safely interruptible.

“Safe interruptibility can be useful to take control of a robot that is misbehaving and may lead to irreversible consequences,” they write, “or to take it out of a delicate situation, or even to temporarily use it to achieve a task it did not learn to perform or would not normally receive rewards for.”

Dyllan Furness
Dyllan Furness is a freelance writer from Florida. He covers strange science and emerging tech for Digital Trends, focusing…