How DeepMind's artificial intelligence will make Google even smarter

Google is ringing in 2014 with a spending spree, first dropping $3.2 billion to acquire Nest Labs and now spending a reported $400 million (or more) on the UK-based artificial intelligence outfit DeepMind.

It’s no secret that Google has an interest in artificial intelligence; after all, technologies derived from AI research help fuel Google’s core search and advertising businesses. AI also plays a key role in Google’s mobile services, its autonomous cars, and its growing stable of robotics technologies. And with the addition of futurist Ray Kurzweil to its ranks in 2012, Google also has the grandfather of “strong AI” on board, a man who forecasts that intelligent machines may exist by midcentury.

Top-down versus bottom-up AI

In general terms, AI refers to machines doing intellectual tasks at a level comparable to humans. That means reasoning, planning, learning, and using language to communicate at a high level. It also probably includes sensing and interacting with the physical world, although those might not be a requirement, depending on who you ask.

AI research is almost as old as computers, going back to the 1950s. Early efforts (sometimes called symbolic or “top-down” AI) were basically collections of rules. The idea was that with enough explicit rules (like IF person(bieber) IS arrested(drunk driving) THEN respond(LOL!)), systems could make decisions and act autonomously – it was just a question of writing enough rules and waiting for computing hardware powerful enough to handle it all. Top-down AI works well when a defined “knowledge base” can be constructed. For instance, in the 1970s, Stanford’s “Mycin” expert system diagnosed blood-borne infections better than many human internists, and in the 1980s the University of Pittsburgh’s “Caduceus” extended the idea to over 1,000 different diseases. In other words, AI in real life isn’t new.
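To make the rules-plus-knowledge-base idea concrete, here is a minimal sketch of a rule-driven system in Python. The facts and rules are invented for illustration; they are not real medical guidance and are not taken from Mycin or Caduceus, whose rule bases were far larger and more nuanced.

```python
# Toy "top-down" AI: a hand-written rule base plus a forward-chaining loop.
# Every fact and rule below is hypothetical, made up for this example.

RULES = [
    # (facts that must all be known, fact to conclude)
    ({"fever", "low_blood_pressure"}, "possible_blood_infection"),
    ({"possible_blood_infection", "positive_culture"}, "recommend_antibiotics"),
]


def forward_chain(facts, rules):
    """Apply the rules repeatedly until no rule adds a new fact."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # A rule fires when all of its conditions are already known.
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


if __name__ == "__main__":
    patient = {"fever", "low_blood_pressure", "positive_culture"}
    print(sorted(forward_chain(patient, RULES)))
    # An input the rule writers never anticipated produces no conclusions.
    print(sorted(forward_chain({"rash"}, RULES)))
```

The second call shows the weakness of the approach: a situation that no rule covers simply passes through unchanged.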

Image: Long exposure of a Roomba’s path, an example of bottom-up AI.

But top-down AI can’t cope with stuff outside its rules-and-knowledge sets. Dealing with the unknown – like an autonomous car navigating the constantly changing conditions on the street – requires an inconceivably large number of rules. So researchers developed behavioral or “bottom-up” AI. Instead of writing thousands (or millions or billions) of rules, researchers built systems with simple behaviors (like “move left” or “read the next word”) and showed those systems which actions worked in different contexts – typically by “rewarding” them with points. Some bottom-up AI technologies are based on real-world neuroscience; for instance, neural networks simulate synaptic connections akin to a biological brain. As they’re trained, bottom-up systems develop behaviors – learn – to cope with unforeseen circumstances in ways top-down AI never managed. Real-world technologies developed in part from bottom-up AI include things like the Roomba vacuum, Siri’s speech recognition, and Facebook’s face recognition. Again, AI in the real world.
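The reward-by-points idea can be sketched in a few lines of Python. This is a deliberately tiny illustration under stated assumptions: the two contexts, the two actions, and the scoring function are all hypothetical, and the learner is a simple table of running averages rather than a neural network or anything a real Roomba uses.

```python
import random

ACTIONS = ["move_left", "move_right"]
CONTEXTS = ["wall_on_left", "wall_on_right"]


def reward(context, action):
    """Hypothetical environment: +1 point for moving away from the wall, -1 for bumping it."""
    bumped = (context == "wall_on_left" and action == "move_left") or (
        context == "wall_on_right" and action == "move_right"
    )
    return -1 if bumped else 1


def train(episodes=500, explore=0.1, step=0.1, seed=0):
    """Learn, by trial and reward, how good each action is in each context."""
    rng = random.Random(seed)
    # Estimated value of every action in every context, all starting at zero.
    value = {c: {a: 0.0 for a in ACTIONS} for c in CONTEXTS}
    for _ in range(episodes):
        context = rng.choice(CONTEXTS)
        if rng.random() < explore:
            action = rng.choice(ACTIONS)  # occasionally try something at random
        else:
            action = max(ACTIONS, key=lambda a: value[context][a])  # reuse what worked
        # Nudge the estimate toward the reward that was just received.
        value[context][action] += step * (reward(context, action) - value[context][action])
    return value


def best_action(value, context):
    """Pick the action with the highest learned value in this context."""
    return max(ACTIONS, key=lambda a: value[context][a])


if __name__ == "__main__":
    learned = train()
    for context in CONTEXTS:
        print(context, "->", best_action(learned, context))
```

Nobody wrote a rule saying “turn away from walls”; the behavior emerges from the scores alone.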

What is machine learning?

Google’s acquisition of DeepMind is partly about “deep learning,” or ways of teaching bottom-up AI systems about complex concepts. Teaching bottom-up systems means throwing data at them and rewarding correct interpretation or behavior – this is called “supervised” training, because the data is already labelled with the correct answers. Of course, most data in the real world (pictures, video feeds, sounds, etc.) is not labelled – or not labelled well. Very basically, deep learning pre-trains bottom-up AI systems on unlabeled (or semi-labelled) data, leaving the systems free to draw their own conclusions. The pre-trained systems then get feedback on their performance from systems that received supervised training – and they catch on very fast, thanks to their previous experience. Layer these systems on top of each other, and you get programs that can quickly cope with unknown and unlabeled data – just the kind of thing Google deals with by the thousands of gigabytes, twenty-four hours a day, seven days a week. Artificial intelligence researchers with connections to DeepMind have indicated the company’s research has recently produced significant advances in this type of machine learning.
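The two-stage idea (learn structure from unlabeled data first, then use a small amount of labelled data to attach meaning) can be illustrated with a much simpler stand-in. The sketch below swaps the neural network for plain one-dimensional clustering, and the numbers and labels are invented, so treat it as an analogy for the workflow rather than an example of deep learning itself.

```python
def two_means(points, iterations=10):
    """Unsupervised step: split unlabeled numbers into two groups (1-D k-means, k=2).

    Assumes the input contains at least two distinct values.
    """
    lo, hi = min(points), max(points)  # initial guesses for the two group centres
    for _ in range(iterations):
        near_lo = [p for p in points if abs(p - lo) <= abs(p - hi)]
        near_hi = [p for p in points if abs(p - lo) > abs(p - hi)]
        lo = sum(near_lo) / len(near_lo)
        hi = sum(near_hi) / len(near_hi)
    return lo, hi


def name_groups(centres, labelled_examples):
    """Supervised step: a handful of labelled examples give each group a name."""
    names = {}
    for value, label in labelled_examples:
        nearest = min(centres, key=lambda c: abs(value - c))
        names[nearest] = label
    return names


def classify(value, centres, names):
    """Assign a new value to its nearest group and report that group's name."""
    nearest = min(centres, key=lambda c: abs(value - c))
    return names.get(nearest, "unknown")


if __name__ == "__main__":
    # Hypothetical measurements with no labels attached.
    unlabeled = [1.0, 1.2, 0.8, 1.1, 9.0, 9.3, 8.7, 9.1]
    centres = two_means(unlabeled)
    # Only two labelled examples are needed once the groups already exist.
    names = name_groups(centres, [(1.0, "small"), (9.0, "large")])
    print(classify(1.3, centres, names))
    print(classify(8.5, centres, names))
```

The point of the analogy is the economy: the unlabeled data does most of the work, and the scarce labelled data only has to name what was already found.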

Sounds silly? Google’s already been at it for years. In 2012 it constructed a (comparatively small) neural network and showed it images culled from YouTube for a week. What did it learn to recognize without any guidance from humans or labelled data? Cats. (Figures, right?) “It basically invented the concept of a cat,” Google fellow Jeff Dean told the New York Times. A year ago Google picked up image-recognition technology developed by Geoffrey Hinton at the University of Toronto and quickly put it to work on photos.google.com (login required) – they got Hinton part time, too. Last summer Google released word2vec, open source deep-learning software that runs on everyday hardware and can figure out relationships between words without labelled training data – that could have huge implications for software deducing concepts and intentions behind written and spoken language. A Google researcher speaking on background indicated he had high hopes for its use in education and information science.
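What “relationships between words” means is easier to see with numbers. Tools in this family represent each word as a list of numbers (a vector), and related words end up pointing in related directions. The sketch below does not run word2vec; the five vectors are written by hand, and their three dimensions (royalty, gender, fruit) are assumptions chosen to keep the arithmetic readable. Real systems learn hundreds of unnamed dimensions from large amounts of text.

```python
import math

# Hand-written toy vectors: (royalty, gender, fruit). Purely illustrative;
# these are not output from word2vec or any trained model.
VECTORS = {
    "king":  (1.0,  1.0, 0.0),
    "queen": (1.0, -1.0, 0.0),
    "man":   (0.0,  1.0, 0.0),
    "woman": (0.0, -1.0, 0.0),
    "apple": (0.0,  0.0, 1.0),
}


def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, 0.0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def analogy(a, b, c):
    """Answer "a is to b as c is to ?" using vec(b) - vec(a) + vec(c)."""
    target = tuple(
        vb - va + vc for va, vb, vc in zip(VECTORS[a], VECTORS[b], VECTORS[c])
    )
    candidates = [w for w in VECTORS if w not in (a, b, c)]
    return max(candidates, key=lambda w: cosine(VECTORS[w], target))


print(analogy("man", "king", "woman"))  # queen
```

With these toy vectors, “man is to king as woman is to ?” comes out as “queen” purely through subtraction and addition, with no dictionary or grammar rules involved.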

What could Google do with deep learning?

What does Google see in DeepMind’s deep learning technology and (perhaps) applications that’s worth hundreds of millions of dollars? Nobody is saying – and both Google and DeepMind representatives declined to comment. But Google has many operations that could benefit:

Video recognition – Google says users upload more than 100 hours of new video to YouTube every minute. Google already scans new content looking for copyright violations and inappropriate material, but systems with deep learning capabilities could take the idea much further, perhaps recognizing people, objects, brands, products, places, and events. Of course, one focus could be piracy and copyright violations (potentially worth hundreds of millions to Google all by itself). But the technology could also better curate the millions of videos on YouTube, making suggestions and related videos much smarter.
Speech recognition and translation – Google Translate is already well regarded, but deep-learning neural networks could make it even better. Imagine traveling to a country where you don’t know the language and speaking with someone in a store using your smartphone; its microphone could hear their speech and pump an English translation into an earbud for you, then translate your speech for them. It’s not far-fetched: Microsoft Research has used the same deep-learning ideas pioneered by Geoffrey Hinton to significantly reduce error rates in speech recognition; combined with Bing Translator, they even have speech recognition, translation, and text-to-speech happening in near-real time.

Better search – Google’s empire is based on search, and Google has long used heuristics to refine results. (Searching for “football” this week will turn up more Super Bowl-related results than three months ago – at least for U.S. users.) Deep-learning technologies mean Google can better understand what people are searching for, producing better results. The same technology can also let Google better understand new information – think social-media posts, news items, and just-published Web pages – faster, delivering the “freshest” results more reliably.
Security – Deep learning and neural networks excel at pattern recognition, whether that’s pixels in an image or behaviors exhibited by users’ accounts or devices. Google could use deep-learning technologies to protect accounts and improve users’ trust in Google (no easy task these days). Security technology augmented by machine learning could not only look for suspicious behavior on individual accounts, but (perhaps more usefully) look at activity across the full breadth of Google’s services, identifying and shutting down malicious attempts to hack, phish, and manipulate users or employees.
Social – Google is already using deep learning technologies in Google+, so don’t be surprised when deep learning augments more social (and mobile) offerings. After all, Google needs to distinguish itself from competitors. Obvious examples include improved face recognition in videos and photos, as well as recognizing places and events, but the technology could go further, recognizing objects (skis, cameras, cars, holiday decor), products, clothing – heck, even types of food. After all, pictures of cats are only outnumbered on social networks by pictures of lunch.
Let’s not forget ecommerce – The bulk of Google’s revenue comes from online advertising, where deep-learning technologies could be applied to targeting users even more precisely with ads. But Google also wants to sell users movies, music, books, and apps via Google Play – and let’s not forget Google has been trying (not very successfully) to sell goods online via efforts like Google Shopping. Just as deep-learning technologies can enrich social experiences, they can power product recommendations and custom offers, perhaps helping Google compete with the likes of Amazon and Groupon.

Google will have to walk a fine line: Any of these applications could exponentially increase Google’s “creep factor” as it leverages our personal data. Curiously, Google’s acquisition of DeepMind reportedly includes oversight by an internal ethics board.

Will DeepMind help the “Google Brain”?

So what about the effort to create an artificial intelligence on par with human intellect? Sadly for fans of robot overlords, the DeepMind acquisition is at best peripheral to that effort, and probably unrelated.

“I’m glad to hear the news about Google’s acquisition of DeepMind, since it will attract more attention to this field,” noted Pei Wang, an artificial general intelligence researcher at Temple University. “However, in my opinion, reinforcement learning and deep learning are not enough to give us ‘thinking machines.’”

Part of the problem is scale. Google’s cat-identifying neural network ran on 16,000 processor cores and had roughly 1 billion connections, while a human brain has an estimated 100 billion neurons and 100 to 500 trillion synapses. Even Google doesn’t have that kind of computing horsepower sitting around.

More significantly, a “node” in a neural network – even one trained by deep learning – doesn’t correspond to a biological neuron. We still only have general ideas of how neurons work. If we want to build human-level intelligence by emulating biological processes, that means modeling physical and chemical details of neurons – and that’ll take even more computing power. Efforts have been made: In 2005, a 27-processor cluster took 50 days to simulate one second of the activity of 100 billion neurons; since then, the biggest brain simulation effort has probably been IBM’s 24,576-node effort to simulate a cat brain – although it did not model individual neurons.
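The 2005 figure is easier to appreciate as a ratio. The short calculation below uses only the numbers quoted above (50 days of computing for one simulated second); the function and constant names are just for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def slowdown(wall_clock_days, simulated_seconds):
    """How many real seconds were needed per second of simulated brain activity."""
    return wall_clock_days * SECONDS_PER_DAY / simulated_seconds


factor = slowdown(50, 1)
print(f"{factor:,.0f}x slower than real time")  # 4,320,000x slower than real time

# At the same rate, one simulated minute would need 60 times as long.
days_per_simulated_minute = 50 * 60
print(f"{days_per_simulated_minute:,} days per simulated minute")  # 3,000 days
```

In other words, that cluster ran more than four million times slower than the brain it was imitating, and a single simulated minute would have taken over eight years.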

In other words, Google is still a long way from achieving the processing scale of a human brain, let alone understanding how it works. Even with DeepMind.