Neural network can create high-res images based on a text description

Image Synthesis From Text With Deep Learning | Two Minute Papers
As far as artificial intelligence goes, 2016 has been the year of deep learning. Brain-inspired neural networks have received massive investments of time, resources, and funding — and, boy, has it ever paid off!

In a new piece of research — carried out by investigators at Rutgers University, the University of North Carolina at Charlotte, Lehigh University, and the Chinese University of Hong Kong — neural networks have been used to generate high quality images based on nothing more detailed than basic text descriptions.

“Generating realistic images from text descriptions has many applications,” researcher Han Zhang told Digital Trends. “Previous approaches have difficulty in generating high resolution images, and their synthesized images in many cases lack details and vivid object parts. Our StackGAN for the first time generates 256 x 256 images with photo-realistic details.”

A video of the work was shared online by YouTuber Károly Zsolnai-Fehér as part of his excellent Two Minute Papers series of educational videos.


“For many years, we have trained neural networks to perform tasks like face, traffic sign, or handwriting recognition,” Zsolnai-Fehér told us. “Generally, with millions of training examples, we show the neural network how to do something, and expect them to learn these concepts, and do well on their own afterwards. This piece of work is completely different: here, after learning the neural networks are able to create something completely new — such as synthesizing new, photorealistic images from a piece of text we have written. This opens up a world of possibilities, and I am super-excited to see where researchers take this concept in the future.”

While there have certainly been examples of computational creativity before — ranging from MIT’s Nightmare Machine to projects that can generate predictive video simply by looking at a still image — this is nonetheless an intriguing piece of work. It’s also fascinating because the two-stage method of drawing images looks, to our way of thinking, a whole lot like the way artists will sketch out a piece of work, and then do a second pass to add detail.
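To make that two-stage idea concrete, here is a minimal, hypothetical sketch in PyTorch of how a StackGAN-style pipeline could be organized. The class names, layer sizes, and the random stand-in for the text embedding are illustrative assumptions, not the authors' actual architecture: Stage I paints a coarse 64 x 64 image from the text plus noise, and Stage II re-reads the same text to refine that sketch into a detailed 256 x 256 result.

```python
import torch
import torch.nn as nn

class StageIGenerator(nn.Module):
    """Hypothetical Stage I: a coarse 64x64 'sketch' from text + noise."""
    def __init__(self, text_dim=128, noise_dim=100):
        super().__init__()
        self.fc = nn.Linear(text_dim + noise_dim, 128 * 4 * 4)
        self.upsample = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, 2, 1), nn.ReLU(),  # 4 -> 8
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),   # 8 -> 16
            nn.ConvTranspose2d(32, 16, 4, 2, 1), nn.ReLU(),   # 16 -> 32
            nn.ConvTranspose2d(16, 3, 4, 2, 1), nn.Tanh(),    # 32 -> 64
        )

    def forward(self, text_emb, noise):
        x = self.fc(torch.cat([text_emb, noise], dim=1))
        return self.upsample(x.view(-1, 128, 4, 4))  # coarse 64x64 image

class StageIIGenerator(nn.Module):
    """Hypothetical Stage II: refine the sketch to 256x256, conditioned on the text."""
    def __init__(self, text_dim=128):
        super().__init__()
        self.encode = nn.Conv2d(3 + text_dim, 64, 3, 1, 1)    # fuse image + text
        self.refine = nn.Sequential(
            nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, 2, 1), nn.ReLU(),   # 64 -> 128
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Tanh(),    # 128 -> 256
        )

    def forward(self, coarse_img, text_emb):
        # Tile the text embedding over the image plane, then refine the details.
        b, _, h, w = coarse_img.shape
        text_map = text_emb[:, :, None, None].expand(b, -1, h, w)
        return self.refine(self.encode(torch.cat([coarse_img, text_map], dim=1)))

text_emb = torch.randn(1, 128)  # stand-in for an encoded sentence
noise = torch.randn(1, 100)
coarse = StageIGenerator()(text_emb, noise)      # shape (1, 3, 64, 64)
detailed = StageIIGenerator()(coarse, text_emb)  # shape (1, 3, 256, 256)
```

In the actual paper, each stage is trained adversarially against its own discriminator, and the text is encoded by a learned embedding rather than random noise; the sketch above only shows how the "rough pass, then detail pass" structure fits together.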

We may still be a long way from replacing human illustrators with robots, but this is an exciting leap forward.

Luke Dormehl
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…