MIT and IBM’s new A.I. image-editing tool lets you paint with neurons

By Luke Dormehl July 2, 2019

Whether it’s automatically tagging objects in pictures or tweaking lighting and separating subjects from their backgrounds with the iPhone’s “portrait mode,” there’s no doubt that artificial intelligence is a powerful force in modern photo-editing tools.

But what if it were possible to go one step further, and use the latest cutting-edge technologies to develop what may just be the world’s most ambitious (and, in its own way, imaginative) paint program — one that goes far beyond simply touching up or coldly analyzing your existing pictures?

With such a program, all a person would need to do to remove an unsightly line of cars sullying a picture of their family home would be to pass over it with a brush. As if by magic, the vehicles would be replaced by a photorealistic grassy bank. Want to eliminate that photobomber from one of your vacation snaps? No problem: Just click to select them and they’ll vanish, replaced by a utility pole that looks like it’s always been there. How about adding an authentically ancient door to a photo of an old church? Click and it’s done. You get the idea.

This is what researchers at Massachusetts Institute of Technology and IBM are working toward with an amazing new tech demonstration they call the “GAN Paint Studio.” Described by its creators as providing the ability to “paint with neurons” — referring to the artificial neurons of a machine learning neural network — it’s one of the most potentially transformative photo-editing tools yet created.

It allows users to upload an image of their choosing and then modify any aspect of it they want, whether that’s changing the size of objects or adding completely new ones. Think of it as Photoshop for the “deepfake” generation, albeit a version that’s currently more of a proof of concept than a finished product.

The future of creative tools

“What we created with this work is a starting point to show how creative tools in the future could work,” Hendrik Strobelt, a research scientist at the MIT-IBM Watson A.I. Lab, told Digital Trends. “We started from a neural network [called a] GAN that can produce its own images of a certain category — for example, kitchen images — and analyzed which internal parts of the network are responsible for producing which feature. This allowed us to modify the images that the network produced. We ‘drew’ on them. The novelty we added is that you can upload your own image of this category and modify it with brushes that do not just draw strokes, but actually draw semantically meaningful units — such as trees, brick-texture, or domes.”
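To make the idea of a “semantic brush” concrete, here is a minimal sketch in plain Python. It assumes the generator’s internal state can be treated as a stack of 2-D activation maps, and that an earlier analysis has already flagged certain units as, say, “tree” units. The function name `paint_units`, the unit indices, and the activation value are all hypothetical illustrations, not the researchers’ code. In a real system the edited activations would then be passed through the remaining layers of the generator to render the image, a step omitted here.

```python
def paint_units(features, unit_ids, region, value):
    """Return a copy of `features` with the chosen units forced to `value`
    at every (row, col) position in `region`.

    features: list of channels, each a 2-D list indexed [row][col]
    unit_ids: indices of the channels ("neurons") tied to one concept
    region:   iterable of (row, col) positions covered by the brush stroke
    value:    activation to write; a high value "draws", 0.0 "erases"
    """
    # Deep-copy so the original activations stay untouched.
    edited = [[row[:] for row in channel] for channel in features]
    for unit in unit_ids:
        for row, col in region:
            edited[unit][row][col] = value
    return edited


# Toy example: three units on a 2x2 grid, all initially inactive.
features = [[[0.0, 0.0], [0.0, 0.0]] for _ in range(3)]
tree_units = [1]                 # hypothetical: unit 1 responds to trees
brush_stroke = [(0, 0), (0, 1)]  # the top row of the grid

with_trees = paint_units(features, tree_units, brush_stroke, value=5.0)
without_trees = paint_units(with_trees, tree_units, brush_stroke, value=0.0)
```

Drawing and erasing are the same operation with different values, which is why one brush can both add a tree and remove one.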

A GAN, or generative adversarial network, is one of the most powerful tools used in generative artificial intelligence. A GAN pits two artificial neural networks against one another. One network generates new images, while the other attempts to work out which images are computer-generated and which are real. Over time, this adversarial process pushes the “generator” network to become good enough at creating images that the “discriminator” can no longer reliably tell them apart from real ones. A GAN was the technology behind the A.I. artwork that famously sold for $432,500 at a Christie’s auction in 2018.
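The tug-of-war between the two networks can be expressed as a pair of loss functions. The sketch below uses plain Python and a deliberately tiny stand-in for the discriminator (a single weight and bias scoring one number rather than an image), so every name and value in it is an illustrative assumption, not part of GAN Paint Studio. It uses the common “non-saturating” form of the generator loss.

```python
import math

EPS = 1e-12  # guards against log(0)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def discriminator(x, w, b):
    """Toy discriminator: estimated probability that sample x is real."""
    return sigmoid(w * x + b)


def discriminator_loss(real, fake, w, b):
    """Cross-entropy loss: low when real samples score high and fakes score low."""
    real_term = sum(-math.log(discriminator(x, w, b) + EPS) for x in real) / len(real)
    fake_term = sum(-math.log(1.0 - discriminator(x, w, b) + EPS) for x in fake) / len(fake)
    return real_term + fake_term


def generator_loss(fake, w, b):
    """Non-saturating generator loss: low when the discriminator rates fakes as real."""
    return sum(-math.log(discriminator(x, w, b) + EPS) for x in fake) / len(fake)


# Real data clusters around 2.0; this discriminator simply prefers larger numbers.
real_samples = [2.0, 2.1]
w, b = 1.0, 0.0

unconvincing_fakes = [-2.0]  # far from the real data
convincing_fakes = [2.0]     # indistinguishable from the real data

loss_unconvincing = generator_loss(unconvincing_fakes, w, b)
loss_convincing = generator_loss(convincing_fakes, w, b)
```

Training alternates between nudging the discriminator to lower its loss and nudging the generator to lower its own, so each network’s progress makes the other’s job harder.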

The system developed by the MIT and IBM researchers showcases some neat abilities. A bit like Deep Dream, the trippy image-generating tool developed by Google researchers several years back, it shows an impressive understanding of which visual elements fit together. Because it was trained on a vast archive of images, it has picked up the basic rules governing relationships between objects. For instance, ask it to add an object in the sky and it won’t draw a window, since it has learned that windows aren’t usually (or ever) found there.

As Strobelt notes, GAN Paint Studio is not quite ready for prime time just yet. Although members of the public can have a go at using it, there’s still more work to be done. Notably, the demonstration version is currently low-resolution. However, it does showcase the immense promise of the technology.

Challenging imagination

“The most fun parts [of the technology] are actually when your imagination is challenged,” Strobelt said. “Try adding a door to the Palazzo Vecchio image; it’s kind of mind-blowing if you know the place. The system is far from perfect, and not every image can be modified equally well. There is still research needed on how to optimize all the parts. For example, when the GAN model tries to represent the input model, it might very well use the wrong semantical units to reproduce features — it [may] just generate a door out of tree units. Figuring out when and how it does do right or wrong is actually very interesting future work.”

Just as GANs get better over time, so Strobelt thinks that the applications for GAN Paint Studio will open up. “The obvious first idea would be a photo editor with these semantic brushes and erasers,” he said. “This could help you edit vacation photos, for example. It could also allow architects to quickly create variations on the embedding of their building renderings. Game designers could [also use it to] modify level maps quicker.”

If such technology could be added to video effects, it would also prove immensely powerful. This would allow objects to be placed into shots with just the touch of a button. Should a director realize they’ve forgotten to include a background item that’s crucial to the plot in a completed scene, it could be quickly added in — without the need for the current expensive and time-consuming visual effects processes.

Strobelt is clear that he doesn’t think GAN Paint Studio is truly, autonomously creative. “No,” he said. “I see this as an advanced tool to help humans who think they are not creative to challenge this thought.”

Then again, what is creativity? As with many other aspects of our lives, such as the jobs we believe only humans can do, it seems that A.I. is ready to ask the big questions.
