Remember when Apple's disastrous 2012 launch of Apple Maps gave real-world geography a dose of accidental "creativity," replacing hospitals with supermarkets and turning bridges into death slides? Well, researchers at Stanford University and Intel have just debuted a new project that creates imaginary street scenes, except these folks have done it on purpose.
What the researchers have developed is an imaginative artificial intelligence that can create photorealistic, Google Street View-style images of streets that do not exist, rendered at a highly detailed 1,024 × 2,048 resolution.
A bit like a comic artist who draws a city backdrop by taking photo references from different places and weaving them together, the Stanford and Intel AI builds its street scenes from individual elements it saw during training, combining them into novel images. The technology that makes this possible is a cascaded refinement network, a type of neural network designed to synthesize high-resolution images with a consistent global structure. Like a regular neural network, a cascaded refinement network has multiple layers, and it generates features one layer at a time, with each layer operating at a higher resolution than the one before it. Each layer receives a coarse feature map from the previous layer and then computes the details locally, allowing the synthesized image to stay consistent as it is refined.
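The coarse-to-fine structure described above can be sketched in a few lines of numpy. This is a minimal illustration, not the researchers' implementation: the module sizes, the random (untrained) 3×3 weights, and the toy three-class semantic layout are all assumptions made for the sketch. What it shows is the cascade itself: each module doubles the resolution of the previous layer's features, mixes in the semantic label map at that resolution, and computes details locally.

```python
import numpy as np

def refine(coarse_feat, label_map, rng):
    """One refinement module (sketch): upsample the coarse features 2x,
    concatenate the semantic label map at this resolution, and compute
    details locally with a random, untrained 3x3 convolution plus ReLU."""
    h, w, c = coarse_feat.shape
    # Nearest-neighbour 2x upsampling of the previous layer's features.
    up = coarse_feat.repeat(2, axis=0).repeat(2, axis=1)
    # Downsample the full-resolution label map to this layer's resolution.
    lh, lw = label_map.shape[:2]
    labels = label_map[::lh // (2 * h), ::lw // (2 * w)]
    x = np.concatenate([up, labels], axis=-1)
    # Local computation: random weights stand in for the learned layers.
    k = rng.standard_normal((3, 3, x.shape[-1], c)) * 0.1
    pad = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((2 * h, 2 * w, c))
    for i in range(3):
        for j in range(3):
            out += pad[i:i + 2 * h, j:j + 2 * w] @ k[i, j]
    return np.maximum(out, 0.0)  # ReLU

def cascade(label_map, start=(4, 8), feat=16, levels=4, seed=0):
    """Run the cascade from a coarse start resolution up to
    start * 2**levels, each module doubling the resolution."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((*start, feat))
    for _ in range(levels):
        x = refine(x, label_map, rng)
    return x

# Toy semantic layout at the final resolution (64 x 128), with three
# one-hot classes (e.g. road / building / sky), all set to class 0 here.
labels = np.zeros((64, 128, 3))
labels[..., 0] = 1.0
out = cascade(labels)
print(out.shape)  # (64, 128, 16)
```

In the real system the cascade continues until the final module reaches the full 1,024 × 2,048 output, and the random kernels are replaced by learned weights trained against reference photographs.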
The result? Street images equivalent to a photo taken with a two-megapixel camera.
While the work is an interesting example of computational creativity in its own right (think Google’s DeepDream or the Massachusetts Institute of Technology’s Nightmare Machine for other examples), this project’s creators think it has multiple real-world applications.
“One application is a new rendering pipeline for video games and animations,” Qifeng Chen, a Stanford Ph.D. researcher on the project, told Digital Trends. “We do not need artists to create the virtual scenes manually. An AI painter can automatically learn from real images and translate the real world content to the virtual world in video games and movies. This approach can save a lot of human labor and potentially synthesize photo-realistic images. The second motivation is that mental imagery is believed to play an important role in decision making and the ability to synthesize photo-realistic images may support the development of artificially intelligent systems.”
Right now, the project is only able to create variations on German streets, because these are the images it was trained on. Going forward, however, it could be possible for the system to expand its knowledge to generate streets styled after any city in the world.
You can read the paper describing the work, "Photographic Image Synthesis with Cascaded Refinement Networks," online.