Shutterstock’s visual search engine could make browsing photos less of a chore


Perhaps somewhat ironically, we are used to searching for photos through text. Looking for a photo of a cat? Just type “cat” into the Google image search bar, and it will return relevant photos (provided they were tagged as such, of course). Keywords do the job at the most basic level, but what if you are looking for a specific type of photo? You could type “yellow cat” or some other generic description, but things get difficult as the description becomes more complex.

To address this, photo agency Shutterstock has just launched a new tool, called Reverse Image Search, that allows customers to upload a photo (up to 5MB) and find images that are similar. Using computer vision, Shutterstock says the tool breaks through the limiting ceiling of metadata.

Besides keywords, “the technology now relies instead on pixel data within images,” wrote Kevin Lester, who is Shutterstock’s vice president of engineering, in a blog post. “It has studied our 70 million images and 4 million video clips, broken them down into their principal features, and now recognizes what’s inside each and every image, including shapes, colors, and the smallest of details; this visual and conceptual data is represented numerically.”

Although this kind of computer vision-based search technology has existed for years (Google’s image search lets you do the same thing), when compared with similar tools offered by other stock photo services, Shutterstock says its technology is the most refined.

“It isn’t the first, but it’s the best on the market,” says Lawrence Lazare, Shutterstock’s product director for search and discovery.

The big benefits of using computer vision are accuracy and speed, and for Shutterstock’s customer base, it solves a major problem with search. It cuts down on the amount of time spent on searching for an image. If you are looking for inspiration or something generic, metadata (keywords) is easier, Lazare says. But if you are creating something specific — an ad campaign, for example — and you have specific requirements for what needs to be in that photo, then words aren’t as successful.

“Words are fallible — some pictures are hard to describe,” Lazare adds. “Some photos would require a short story to describe, and people don’t search like that.”

For example, typing “sunset” into the search bar will result in 14,394 pages encompassing 1,439,383 photos, illustrations, and vector art that depict a sunset. And the results depend on whether the photographer added keywords properly (sometimes a photographer will tag a whole batch of photos, say from a wedding, with the same set of keywords, even though some photos in the batch aren’t related to them).

The image on the left shows varied results when searching by keywords. The image on the right shows visually similar results when using visual search.

You could narrow down the search results by adding keywords, like “city and architecture,” but, as it turns out, you’ll still have 140,330 options to browse through. It’s even more difficult when the photo in your head has nuances like the angle of a building or the color of a sunset.

That is why visual similarity is more useful than keyword similarity, Lazare says, but this type of search requires a significant amount of machine learning, and it is not an easy task. When an image is uploaded, the computer breaks it down numerically, into a form it can work with, so that it can compare and contrast the important aspects of the image. The computer has to compare the upload against the millions of photos in Shutterstock’s archive, and do so incredibly quickly: The algorithms take less than 20 milliseconds to compare it against 70 million images. For the computer, some photos are easier to decipher than others; things like abstract art or colors are a bit harder, and the computer is more likely to return “false positives.”
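To make the idea of comparing images "numerically" more concrete, here is a minimal sketch in Python. It is not Shutterstock's implementation, which the company has not published in this article. It assumes each image has already been reduced to a short list of numbers (a feature vector) and that similarity is measured with cosine similarity, a common choice. The function names, the three-number vectors, and the image IDs are all hypothetical.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Return the cosine of the angle between two feature vectors.

    Values near 1.0 mean the vectors point the same way (similar images);
    values near 0.0 mean they have little in common.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector carries no information
    return dot / (norm_a * norm_b)


def most_similar(
    query: Vector, catalog: Dict[str, Vector], k: int = 3
) -> List[Tuple[str, float]]:
    """Score every catalog image against the query and return the top k."""
    scored = [
        (image_id, cosine_similarity(query, vector))
        for image_id, vector in catalog.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Hypothetical toy catalog: real feature vectors would have hundreds or
# thousands of dimensions and would come from a trained neural network.
catalog = {
    "sunset_beach": [0.9, 0.1, 0.0],
    "sunset_city": [0.8, 0.3, 0.1],
    "cat_portrait": [0.0, 0.2, 0.9],
}

# Hypothetical vector for the photo a customer uploads.
query = [0.9, 0.1, 0.05]

results = most_similar(query, catalog, k=2)
for image_id, score in results:
    print(f"{image_id}: {score:.3f}")
```

This sketch checks the upload against every image one by one, which is fine for three images. A catalog of 70 million images answering in milliseconds would presumably need a specialized index or approximate search instead of a full scan; that is our assumption, not something Shutterstock described.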

To achieve its success rates, the neural network used by Shutterstock’s computers required a lot of training. At the beginning, the results weren’t good, but over time the responses improved, reflecting what the network was learning on its own. Lester, who oversees search as well as the computer vision team, told us that in about a year’s time, the company managed to go from having nothing to having something that works well.

From our own experiments (the feature is live, and anyone can try it out by uploading an image), we can say the visual search tool is pretty good. Although it has trouble with complicated photos, it’s more successful with simpler ones. But Shutterstock, of course, isn’t the only company to develop a visual search engine: We noticed equally good results via Google’s image search, and many of Shutterstock’s competitors offer visual search as well. (Shutterstock showed us similar technology from competitors and claims it isn’t as successful, which is one reason the company decided to build its own from scratch.)

We uploaded a fairly complicated photo, and it threw the computer off. However, the tool does recognize that it’s some type of architecture.

This all shows just how far computer vision and machine learning have come in a relatively short time. And it’s only going to get better: Shutterstock is adding new tools to its network that would allow users to give its computers feedback about the quality of the search results, and it will soon unveil visual search for its four million video footage assets, which are an even greater challenge than static photos.

Les Shu
Former Digital Trends Contributor
I am formerly a senior editor at Digital Trends. I bring with me more than a decade of tech and lifestyle journalism…