Like a vice principal in the sky, this A.I. spots fights before they happen

Eye in the Sky: Real-time Drone Surveillance System (DSS) for Violent Individuals Identification

We live in a surveillance society: A U.S. citizen is reportedly captured on CCTV around 75 times per day. And that figure is even higher elsewhere in the world. Your average Brit is likely to be caught on surveillance cameras up to 300 times in the same period.

But a lot of existing CCTV networks still rely on people to operate them. Depending on the circumstances, there might be a human being at the other end, watching multiple camera feeds on a bank of monitors. Alternatively, there may be no one watching at all, with the footage viewed only when an incident calls for it.

Two cutting-edge technologies may shake up surveillance as we know it, however. Researchers from the U.K.’s University of Cambridge and India’s National Institute of Technology and Indian Institute of Science, Bangalore, have published a new paper describing a drone-based surveillance system that uses UAVs as flying security cameras to keep an eye (or several) on large gatherings of people.

What’s more, these drones are equipped with deep learning artificial intelligence algorithms that allow them to identify troublemakers in crowds and take the proper precautions.

The “Eye in the Sky” real-time drone surveillance system could be deployed at events like music festivals, marathons or other large gatherings, where it would be utilized to identify violent individuals — based on their aggressive posture — using the latest pattern recognition technology. It then promises to alert the authorities. Welcome to the future of surveillance!

Identifying attackers in real time

“Our system is able to identify the violent individuals real-time,” Amarjot Singh, one of the researchers on the project, told Digital Trends. “This was quite challenging, and we had to develop unique ways to achieve [it]. The problem was that the standard deep learning algorithms require tens of thousands of annotated images to train these systems. This would normally be fine, if one was to assign just one label to an image — for example, assign ‘cat’ to an image of a cat. But in order to design a system which can detect the human pose from aerial images, the system needs to be trained with aerial images annotated with 14 key points on the human body. Since there is no dataset available for this type of application, we ourself annotated the points on the human body, which was extremely time-consuming and expensive.”

The ScatterNet Hybrid Deep Learning neural network analyzing body language via a complex motion-tracking algorithm inside a drone. University of Cambridge/National Institute of Technology/Indian Institute of Science

By breaking the human body down into 14 different points, the system is able to work out which violent action, if any, is being performed. Its accuracy ranges from around 85 percent up to 94.1 percent, depending on how many people are being surveilled.
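To make the 14-point idea concrete, here is a minimal Python sketch of how a pose described by 14 key points could be turned into numbers a classifier might use. It is an illustration only, not the researchers' method: the key-point names, the skeleton in `LIMBS`, and the choice of limb orientation angles as features are all assumptions, since the article gives only the number of points.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (x, y) in image coordinates, y grows downward

# Hypothetical names for the 14 key points; the article states only the count.
KEYPOINTS = [
    "head", "neck",
    "r_shoulder", "r_elbow", "r_wrist",
    "l_shoulder", "l_elbow", "l_wrist",
    "r_hip", "r_knee", "r_ankle",
    "l_hip", "l_knee", "l_ankle",
]

# Assumed skeleton: 13 limb segments connecting the 14 points.
LIMBS = [
    ("head", "neck"),
    ("neck", "r_shoulder"), ("r_shoulder", "r_elbow"), ("r_elbow", "r_wrist"),
    ("neck", "l_shoulder"), ("l_shoulder", "l_elbow"), ("l_elbow", "l_wrist"),
    ("neck", "r_hip"), ("r_hip", "r_knee"), ("r_knee", "r_ankle"),
    ("neck", "l_hip"), ("l_hip", "l_knee"), ("l_knee", "l_ankle"),
]


def limb_angle(pose: Dict[str, Point], start: str, end: str) -> float:
    """Orientation of the segment start -> end, in degrees from the x axis."""
    (sx, sy), (ex, ey) = pose[start], pose[end]
    return math.degrees(math.atan2(ey - sy, ex - sx))


def pose_features(pose: Dict[str, Point]) -> List[float]:
    """Return one orientation angle per limb, in the order given by LIMBS."""
    missing = set(KEYPOINTS) - set(pose)
    if missing:
        raise ValueError(f"pose is missing key points: {sorted(missing)}")
    return [limb_angle(pose, start, end) for start, end in LIMBS]
```

A vector like this could then be handed to any standard classifier trained on labeled examples of punching, kicking, and so on. Which features and classifier the actual system uses is not described in the article.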

The algorithm was trained by analyzing 2,000 annotated images gathered from a group of 25 participants, who were asked to act out violent attacks such as punching, kicking, strangling, shooting, and stabbing. Accuracy levels drop the more people are being monitored and the further away the drone is, although this could be improved in the future. The finished algorithm, called the ScatterNet Hybrid Deep Learning neural network, can learn using a relatively small number of training examples, while robustly extracting the body’s posture with consistent accuracy.

Just as important, it can do it very, very quickly — within just a few frames of video. This is especially important for security applications, since any potential system designed for this purpose needs to be able to alert authorities of an escalating situation before it has erupted into violence.

“The system detects the violent individuals by first extracting and sending the aerial frame recorded by the drone to the Amazon cloud,” Singh continued. “The human detection algorithm detects each human in the image frame. Next, the pose is estimated for each individual. The pose of the individuals involved in the violent activity are jointly analyzed to identify the violent individual.”
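The three stages Singh describes (detect each person, estimate each pose, then analyze the poses together) can be sketched as a simple pipeline. This is a rough outline under stated assumptions, not the team's code: the class name `ViolenceDetectionPipeline`, the type aliases, and the three pluggable functions are hypothetical, and the cloud upload step is left out entirely.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

Box = Tuple[int, int, int, int]   # x, y, width, height of one detected person
Pose = List[Tuple[float, float]]  # key points estimated for one person


@dataclass
class ViolenceDetectionPipeline:
    """Hypothetical wiring of the stages; each stage is supplied by the caller."""
    detect_humans: Callable[[Any], List[Box]]
    estimate_pose: Callable[[Any, Box], Pose]
    classify_jointly: Callable[[List[Pose]], List[bool]]

    def process_frame(self, frame: Any) -> List[Tuple[Box, bool]]:
        # Stage 1: find every person in the aerial frame.
        boxes = self.detect_humans(frame)
        # Stage 2: estimate a pose for each detected person.
        poses = [self.estimate_pose(frame, box) for box in boxes]
        # Stage 3: look at all poses together and flag violent individuals.
        flags = self.classify_jointly(poses)
        if len(flags) != len(boxes):
            raise ValueError("classifier must return one flag per person")
        return list(zip(boxes, flags))


if __name__ == "__main__":
    # Placeholder stages, used only to show the call shape.
    pipeline = ViolenceDetectionPipeline(
        detect_humans=lambda frame: [(0, 0, 10, 20), (30, 0, 10, 20)],
        estimate_pose=lambda frame, box: [(box[0] + 5.0, box[1] + 10.0)],
        classify_jointly=lambda poses: [False] * len(poses),
    )
    print(pipeline.process_frame(frame=None))
```

Passing the stages in as functions keeps the sketch independent of any particular detector or pose model, none of which the article specifies in enough detail to reproduce.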

But all of this is still quite far away, right? Not necessarily. “We will be flying the drone at the technical festival at NIT Warangal in Andhra Pradesh in India, in October of this year,” Singh said. “The second author is from NIT Warangal, and I am an alumnus. The festival is attended by around 3,000 – 4,000 people and is extremely packed.” The pair also hoped to use the drone at another event in India called “Spring Spree.”

A figure from the research paper showcasing how the A.I.-equipped drone processes body language by looking at visual cues based on specific points around the body. University of Cambridge/National Institute of Technology/Indian Institute of Science

It won’t be used for alerting the police to violent action at these events, but rather to prove that the technology can accurately predict the outbreak of violence. “Once, the system can do well in the test runs, we will be bringing it to market,” Singh said. “We are also planning to extend this system to monitor the borders of India.”

Possible ethical concerns?

Technology like this will, of course, prompt polarizing responses. Police in the U.S. are already using drones on a regular basis, and this will only ramp up in the years to come, although right now that is done primarily for tasks like assessing crime scenes.

The idea of using A.I.-equipped drones for long-term surveillance of crowds runs the risk of verging on the Orwellian. Like predictive policing, it additionally carries the ominous suggestion of possible “pre-crime.” While this particular project focuses on actions carried out, there are examples of smart CCTV cameras with the goal of intervening before an act is carried out.

In the U.K., for instance, an A.I. algorithm is used for spotting potential jumpers on the London Underground train service. This system works by analyzing the behavior of people waiting on the platform, and then looking for those who miss several available trains during that time. This is because such actions have been shown to precede suicide attempts, thereby allowing intervention to be made.
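The missed-trains rule described above is simple enough to sketch in a few lines of Python. This is a toy illustration, not the system used on the Underground: the event format, the function name `flag_waiting_passengers`, and the threshold of three trains (standing in for "several") are all assumptions.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Event = Tuple[str, Hashable]  # ("arrive", person), ("leave", person), ("train", None)


def flag_waiting_passengers(events: Iterable[Event], threshold: int = 3) -> List[Hashable]:
    """Return people who stayed on the platform while `threshold` trains passed.

    Events are assumed to be in time order. A "train" event means a train has
    come and gone; anyone who boarded it appears as a "leave" event before it.
    """
    missed: Dict[Hashable, int] = {}  # person -> trains missed while present
    flagged: List[Hashable] = []
    for kind, person in events:
        if kind == "arrive":
            missed[person] = 0
        elif kind == "leave":
            missed.pop(person, None)
        elif kind == "train":
            for waiting in missed:
                missed[waiting] += 1
                # Flag each person once, at the moment they reach the threshold.
                if missed[waiting] == threshold:
                    flagged.append(waiting)
        else:
            raise ValueError(f"unknown event kind: {kind!r}")
    return flagged
```

A real deployment would have to track people from video rather than from a tidy event log, which is the hard part that this sketch skips.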

Researcher Amarjot Singh

It’s one thing to use A.I. to more rapidly stop a scuffle that has broken out; it is perhaps another to surveil someone on the basis of body language suggesting that they might do something. Such questions will need to be explored as technology such as this becomes mainstream. If it is able to prevent violent acts on the street, we may well consider the tradeoff to be worth it.

Ultimately, concepts such as this recall philosopher Jeremy Bentham’s dream of the Panopticon. This was a proposed prison design in which the presence of a central guard tower makes the prisoners believe that they could be under observation at any moment. The result, writers like Michel Foucault have suggested, is that prisoners wind up behaving as though they are being watched at all times.

As drones increasingly take to the skies, the existence of tools like this could prompt us to act in a similar way. Is that a police drone or an Amazon delivery flying overhead? You’d better straighten up your posture and loosen your shoulders just in case!

Luke Dormehl
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…