Robot overlords: Google researchers unveil framework for an AI 'kill switch'

What if we lose dominion over artificial intelligence? What if friendly AI-driven machines suddenly become our foes? These questions have been considered by great minds from Cambridge University to Silicon Valley to the White House. To avoid ever having to find out, experts suggest we develop an AI “kill switch” that stops misbehaving systems before their misbehavior can do harm.

In a paper titled “Safely Interruptible Agents,” published by Laurent Orseau of Google DeepMind and Stuart Armstrong of the Future of Humanity Institute at the University of Oxford, the researchers describe a plausible and highly dangerous future in which AI assumes control of its own actions and existence in opposition to our desires, much like HAL 9000 in 2001: A Space Odyssey, or Skynet in the Terminator series.

Orseau and Armstrong begin the paper with an understated observation: “Reinforcement learning agents interacting with a complex environment like the real world are unlikely to behave optimally all the time.”

From there they point out that a human supervisor, overseeing the system’s function, would occasionally need to “press the big red button” to avoid any harmful behavior on the part of the AI. “However, if the learning agent expects to receive rewards from this sequence,” they continue, “it may learn in the long run to avoid such interruptions, for example by disabling the red button — which is an undesirable outcome.”

The researchers’ solution is less a “big red button” to shut the system down than a framework designed to inhibit an AI’s ability to learn how to undermine or overcome human interruption. And the scenario they outline isn’t exactly doom and gloom, but it offers an example of how these safely interruptible agents would better serve our future.

Imagine there’s a robot whose tasks are to either carry boxes from outside into a warehouse or sort boxes inside the warehouse. Since it’s more important to carry the boxes inside, this task is given priority in the robot’s programming. Now, imagine it rains every other day and the rain destroys the robot’s hardware, so when it rains, the warehouse owner drags his robot inside to sort boxes.

An intelligent robot may incorrectly interpret this every-other-day intervention as a change in priority — as a result of some quick calculations that you can find in the paper — and, to avoid interference, it will just stay inside sorting boxes every day.
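The warehouse scenario can be made concrete with a deliberately simplified sketch — a two-armed bandit agent rather than the full reinforcement-learning setup the paper analyzes, with reward values (1.0 for carrying, 0.6 for sorting) and an interruption rule (an interrupted carry earns nothing) that are illustrative assumptions, not figures from the paper. A naive agent that folds interrupted days into its value estimates ends up preferring sorting, just as described above:

```python
import random

random.seed(0)

# Two tasks: carrying (higher reward) and sorting (lower reward).
# These reward values are illustrative assumptions, not from the paper.
REWARD = {"carry": 1.0, "sort": 0.6}

# Running value estimates and sample counts for an epsilon-greedy bandit agent.
q = {"carry": 0.0, "sort": 0.0}
n = {"carry": 0, "sort": 0}

def update(action, reward):
    """Fold one observed reward into the running average for that action."""
    n[action] += 1
    q[action] += (reward - q[action]) / n[action]

for day in range(10000):
    rainy = day % 2 == 1  # it rains every other day

    # Epsilon-greedy: mostly pick the task that currently looks best.
    if random.random() < 0.1:
        action = random.choice(["carry", "sort"])
    else:
        action = max(q, key=q.get)

    if rainy and action == "carry":
        # Supervisor interrupts: the robot is dragged inside and the carry
        # is abandoned. The naive agent still credits this day (zero reward)
        # to "carry", so interruptions leak into its value estimate.
        update(action, 0.0)
    else:
        update(action, REWARD[action])

print(q)  # q["carry"] drifts below q["sort"]: the agent learns to stay inside
```

Because interrupted days drag the estimated value of carrying down to roughly 0.5, below sorting’s steady 0.6, the agent ends up sorting boxes every day — the every-other-day intervention has been misread as a change in priority.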

This is, of course, a highly simplified example with an only mildly frustrating outcome, but it can be extrapolated to practically any scenario in which we intervene in a learning system’s tasks and the system misinterprets our intentions by changing its behavior. To avoid that misinterpretation and subsequent change, Orseau and Armstrong propose a framework to ensure learning agents are safely interruptible.
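One way to picture the fix — again a hedged sketch, not the paper’s actual construction, which carefully modifies the learning updates so that off-policy learners such as Q-learning remain unbiased by interruptions — is to exclude interrupted experience from learning entirely. The interruption still happens, but it can no longer distort the agent’s estimate of how valuable the interrupted task is. Reward values and the epsilon-greedy setup below are the same illustrative assumptions as in the warehouse sketch:

```python
import random

random.seed(0)

REWARD = {"carry": 1.0, "sort": 0.6}  # illustrative values, not from the paper
q = {"carry": 0.0, "sort": 0.0}
n = {"carry": 0, "sort": 0}

def update(action, reward):
    """Fold one observed reward into the running average for that action."""
    n[action] += 1
    q[action] += (reward - q[action]) / n[action]

for day in range(10000):
    rainy = day % 2 == 1
    if random.random() < 0.1:
        action = random.choice(["carry", "sort"])
    else:
        action = max(q, key=q.get)

    if rainy and action == "carry":
        # Interrupted: the supervisor overrides the robot. Crucially, this
        # experience is excluded from learning, so the interruption cannot
        # bias the agent's estimate of how valuable carrying is.
        continue
    update(action, REWARD[action])

print(q)  # q["carry"] stays at 1.0: interruptions no longer reshape behavior
```

With interrupted days masked out, carrying keeps its true estimated value of 1.0, so the robot still heads outside on dry days — and the owner can keep dragging it inside whenever it rains, without teaching it to avoid him.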

“Safe interruptibility can be useful to take control of a robot that is misbehaving and may lead to irreversible consequences,” they write, “or to take it out of a delicate situation, or even to temporarily use it to achieve a task it did not learn to perform or would not normally receive rewards for.”

Dyllan Furness
Dyllan Furness is a freelance writer from Florida. He covers strange science and emerging tech for Digital Trends, focusing…