Startup uses deep learning to let you shop for items by snapping photos

Computer vision and image recognition, aka the ability to show a computer a picture of something and have it tell you what it is seeing, has been befuddling researchers for years.

In the past decade, however, it has taken enormous leaps forward thanks to deep learning neural networks. These are essentially vast computational approximations of the way that the human brain works, which can learn to recognize objects and people based on training examples.

A number of companies have attempted to use this kind of technology to transform the retail space, but so far none have really succeeded. We scan QR codes when we go into a store, or type the name of a book into Amazon, but the technology that lets us snap a quick picture of, say, a chair that we like and easily search for it (or similar items) online has largely eluded us.

This is a problem a team of Cornell University researchers is trying to solve. With a new startup called GrokStyle, co-founders and computer scientists Sean Bell and Kavita Bala have developed a state-of-the-art algorithm that is more accurate than any competing system at recognizing objects within a picture and then linking them to real-world items for sale.

“What we’re focused on is the post-search experience,” Bell, a computer science Ph.D., told Digital Trends. “That’s something I don’t feel has been very good so far. What we’re doing is not just recognizing a product, but finding out who else sells it, whether there are existing variations, whether you can get it in a different type of wood, etc. We’re not just trying to answer the ‘what is this?’ question about objects, but trying to answer the related questions people have in the shopping experience.”

As you can imagine, this isn’t easy, particularly since it goes further than apps that recognize, say, book covers or movie posters and let you search for that one item alone.

“You may be in a restaurant and see a chandelier you like, and want to find one that is the same and available for sale, or similar but in a different color or price point,” Bell said. “The idea is that you take a picture of something you like, and then based on that image, you get presented with a list of similar items. You can then filter these based on location, material, and various other metrics. It’s not just simply about either finding something or not finding it; we wanted to give you a wide variety of choice.”

Going forward, the idea is to even let users single out particular qualities of an object (say, its grain of wood or fabric) and then find other complementary items. “Our system lets you hold certain aspects as constants, and then change other attributes,” he said.
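
For readers curious how "find similar items, then hold some attributes constant" might look in practice, here is a minimal sketch in Python. It is not GrokStyle's code. It assumes each product image has already been turned into a short list of numbers (an "embedding"), presumably by a trained neural network, and it ranks catalog items by cosine similarity to the photo's embedding. The catalog, the embedding values, the `find_similar` function, and the choice of cosine similarity are all hypothetical, chosen only for illustration.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Hypothetical catalog. The embeddings are made-up numbers; a real system
# would compute them from product photos with a trained network.
CATALOG = [
    {"name": "oak chair", "material": "oak", "embedding": [0.9, 0.1, 0.0]},
    {"name": "walnut chair", "material": "walnut", "embedding": [0.8, 0.2, 0.1]},
    {"name": "brass chandelier", "material": "brass", "embedding": [0.0, 0.9, 0.4]},
    {"name": "oak stool", "material": "oak", "embedding": [0.6, 0.1, 0.5]},
]


def find_similar(query_embedding, catalog, top_k=3, **fixed_attributes):
    """Rank catalog items by visual similarity to the query.

    Keyword arguments (for example material="oak") are attributes held
    constant: only items matching all of them are considered.
    """
    candidates = [
        item
        for item in catalog
        if all(item.get(key) == value for key, value in fixed_attributes.items())
    ]
    ranked = sorted(
        candidates,
        key=lambda item: cosine_similarity(query_embedding, item["embedding"]),
        reverse=True,
    )
    return ranked[:top_k]


# Made-up embedding standing in for a snapshot of a chair.
query = [0.9, 0.1, 0.05]

results = find_similar(query, CATALOG)
oak_only = find_similar(query, CATALOG, material="oak")

print([item["name"] for item in results])
print([item["name"] for item in oak_only])
```

The first call returns the three closest items overall; the second holds the material fixed at oak and ranks only what remains, which is one simple way to read Bell's description of keeping some aspects constant while others vary.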

The system the team has developed, described in the journal ACM Transactions on Graphics under the name “Learning visual similarity for product design with convolutional neural networks,” was benchmarked at twice the accuracy of competing methods. The team hopes users will be able to try out the system in the coming months.

“This is an exciting area, and ultimately we think it’ll come down to who has the best technology,” Bell said. “Right now we’re state of the art in terms of product image recognition. We’ve demonstrated to the academic community that our system for doing this is the most accurate. We want to continue that edge going forwards.”

Luke Dormehl
I'm a UK-based tech writer covering Cool Tech at Digital Trends. I've also written for Fast Company, Wired, the Guardian…