
Scientists develop program that matches doodles with photographs

Ian Norman/Flickr
As FindFace has shown, image recognition has made incredible strides over the past few years. Despite this, artificial intelligence still struggles with certain types of images, such as drawings and doodles. A team of scientists is looking to change that with a new program that could match doodles and sketches to similar photographs.

Scientists believe the solution to bridging this gap is deep learning, a type of artificial intelligence that uses neural networks to analyze and process images. What are neural networks, you ask? Neural networks are brain-like computer systems designed to pick out and create patterns based on inputs.
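To make the "brain-like" description concrete, here is a minimal sketch of what a neural network computes: inputs are multiplied by learned weights, passed through a nonlinearity, and reduced to output scores. The sizes and random weights below are purely illustrative stand-ins, not the actual system described in the article.

```python
import numpy as np

# A minimal two-layer neural network forward pass (illustrative only;
# real image-recognition networks have millions of learned parameters).
def forward(x, w1, b1, w2, b2):
    hidden = np.maximum(0, x @ w1 + b1)  # ReLU activation
    return hidden @ w2 + b2              # output scores

rng = np.random.default_rng(0)
x = rng.normal(size=(1, 4))                    # a tiny 4-feature "input image"
w1, b1 = rng.normal(size=(4, 8)), np.zeros(8)  # input -> hidden weights
w2, b2 = rng.normal(size=(8, 3)), np.zeros(3)  # hidden -> output weights

scores = forward(x, w1, b1, w2, b2)  # one score per candidate class
print(scores.shape)
```

Training adjusts the weights so that the patterns the network picks out match the patterns in its inputs.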


In an effort to get around the current limitations of image recognition techniques, scientists have started work on a program that can take a doodle or sketch of an image and find a matching photograph – a visual search engine, if you will.


One of the main reasons doodles have stumped image recognition programs in the past is that the drawings often aren’t accurate representations. Rather than being lifelike recordings of what we’re doodling, sketches and drawings are merely how we visualize something.

These less-than-accurate representations, whether on purpose or due to a lack of drawing skill, often lead to exaggerated proportions and missing elements that might not prevent humans from recognizing the image, but cause problems for computers. For example, that stick figure you drew in third grade likely wouldn’t be recognized as a human by an image recognition program, as its body is nothing more than a few lines placed below a circle – not the lifelike proportions of the human form.

While the program is still in its infancy, New Scientist notes it has already proven viable. Working with more than 600 participants, scientists displayed a random image for just two seconds and asked each participant to sketch the pictured object from memory. The resulting sketches were fed into one neural network while the images shown to the participants were fed into another. The two networks then worked together to analyze the sketches and match them to their original pictures.
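The article doesn't detail how the two networks cooperate, but a common approach for this kind of task is to have each network map its input (sketch or photo) into a shared embedding space, then match by nearest neighbor. The toy vectors below are hypothetical stand-ins for real network outputs, shown only to illustrate the matching step.

```python
import numpy as np

# Hypothetical sketch: each network embeds its input as a vector;
# matching picks the photo whose embedding is closest (cosine similarity).
def cosine_match(sketch_embeddings, photo_embeddings):
    s = sketch_embeddings / np.linalg.norm(sketch_embeddings, axis=1, keepdims=True)
    p = photo_embeddings / np.linalg.norm(photo_embeddings, axis=1, keepdims=True)
    similarity = s @ p.T              # pairwise cosine similarities
    return similarity.argmax(axis=1)  # index of best-matching photo per sketch

# Toy 2-D embeddings standing in for the networks' real outputs.
photos = np.array([[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]])
sketches = np.array([[0.9, 0.1], [0.1, 0.8]])
print(cosine_match(sketches, photos))  # → [0 1]
```

A correct match here means the sketch's nearest photo embedding is the photo it was drawn from, which is how the 37 percent figure below would be scored.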

In the initial testing phase, the program was able to match the doodles to their original images 37 percent of the time. That may seem unimpressive, but it’s important to note that humans matched the sketches and images correctly only 54 percent of the time. While not conclusive, these preliminary numbers show that with a little tweaking and more inputs, the program could very well surpass human capabilities at matching sketches to photos.

Where would this technology be useful though? The possibilities are limitless, really. Looking at it from a consumer standpoint, this tech could be used to create an app that lets you search for images and artwork based on doodles of your own. Imagine being able to draw up a little sketch, scan it with your phone, and find similar artwork to be inspired by or to compare your own work to.

From a more commercial standpoint, one of the most useful applications of this technology would be helping police identify criminals by comparing sketch artists’ renderings to mugshot databases.

More thorough findings and information on the system are set to be released at the SIGGRAPH conference in Anaheim, California, in July.

Gannon Burgett
Former Digital Trends Contributor
Lambda’s machine learning laptop is a Razer in disguise
The Tensorbook ships with an Nvidia RTX 3080 Max-Q GPU.

The new Tensorbook may look like a gaming laptop, but it's actually a notebook that's designed to supercharge machine learning work.

The laptop's similarity to popular gaming systems doesn't go unnoticed, and that's because it was designed by Lambda through a collaboration with Razer, a PC maker known for its line of sleek gaming laptops.

Read more
An Amazon A.I. scientist wants to transform downtown Jackson, Mississippi
Nashlie Sephus

Most people look at a couple of vacant lots and see … vacant lots. But Nashlie Sephus sees gold.

Sephus, a 35-year-old Black A.I. researcher with Amazon, plans to turn seven buildings and about 500,000 square feet of downtown Jackson, Mississippi, into a technology park and incubator. Her story, as detailed on Inc.’s website, is remarkable:
The 35-year-old has spent the past four years splitting her time between Jackson, her hometown, and Atlanta, where she works as an applied science manager for Amazon's artificial intelligence initiative. Amazon had acquired Partpic, the visual recognition technology startup where she was chief technology officer, in 2016 for an undisclosed sum. In 2018, she founded the Bean Path, an incubator and technology consulting nonprofit in Jackson that she says has helped more than 400 local businesses and individuals with their tech needs.
But beyond entrepreneurship and deep A.I. know-how, Sephus is eager to bring tech to a city hardly known for its tech roots. "It's clear that people don't expect anything good to come from Jackson," she told Inc. "So it's up to us to build something for our hometown, something for the people coming behind us."

Read more
The BigSleep A.I. is like Google Image Search for pictures that don’t exist yet
Eternity

In case you’re wondering, the picture above is "an intricate drawing of eternity." But it’s not the work of a human artist; it’s the creation of BigSleep, the latest amazing example of generative artificial intelligence (A.I.) in action.

A bit like a visual version of the text-generating A.I. model GPT-3, BigSleep is capable of taking any text prompt and visualizing an image to fit the words. That could be something esoteric like eternity, or it could be a bowl of cherries, or a beautiful house (the latter of which can be seen below). Think of it like a Google Images search -- only for pictures that have never previously existed.
How BigSleep works
“At a high level, BigSleep works by combining two neural networks: BigGAN and CLIP,” Ryan Murdock, BigSleep’s 23-year-old creator, a student studying cognitive neuroscience at the University of Utah, told Digital Trends.
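The general recipe behind such systems is to search the generator's latent space for an image the scorer rates as matching the prompt. The toy code below imitates that loop with stand-in functions (BigSleep itself uses BigGAN as the generator, CLIP as the text-image scorer, and gradient ascent rather than the random search shown here); every function and vector in it is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
target = rng.normal(size=16)            # stand-in for CLIP's prompt embedding

def generate(latent):                   # stand-in for the BigGAN generator
    return np.tanh(latent)

def clip_score(image, text_embedding):  # stand-in for CLIP's similarity score
    return -np.sum((image - text_embedding) ** 2)

# Hill-climb the latent: keep any random perturbation that scores better.
latent = np.zeros(16)
for _ in range(200):
    candidate = latent + 0.1 * rng.normal(size=16)
    if clip_score(generate(candidate), target) > clip_score(generate(latent), target):
        latent = candidate

# The optimized latent now decodes to an "image" that better matches the prompt.
print(clip_score(generate(latent), target) > clip_score(generate(np.zeros(16)), target))
```

In the real system, the scorer's gradient flows back through the generator, so each step nudges the latent directly toward a higher CLIP similarity instead of guessing randomly.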

Read more