MIT's Pic2Recipe A.I. Can Predict Food Ingredients By Analyzing a Photo

Scrolling through food photography can bring on the desire to recreate a dish at home, but what if the ingredients aren’t listed? Could there be a way to find out just by analyzing the image? That’s what researchers at the Massachusetts Institute of Technology asked when they set out to create a deep learning algorithm that could predict a recipe from a photo alone. The research, published on July 20, resulted in a program called Pic2Recipe that predicted a dish’s recipe from a photo with a 65 percent success rate.

Earlier attempts to turn photos into recipes were limited by smaller datasets, although “small” is relative to all the possible recipes available. One study used 65,000 recipes, but it included only traditional Chinese cuisine; another reached only about 50 percent accuracy in initial testing. Because deep learning algorithms “learn” from being fed large quantities of data, the resulting programs had large gaps in the ingredients they could recognize, which hurt their accuracy.

To do better, the researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) knew the software would have to be based on a wide-ranging set of data. To solve the narrow-dataset problem, the team turned to large sets of photos and recipes that already exist: food websites. Compiling data from sites like Food.com and Allrecipes, the team created Recipe1M, a dataset of over one million recipes.

Using those recipes and the associated images, the team trained the software to use object recognition to pick up on what each dish’s ingredients might be. With a list of ingredients, the system then selected the recipe that best matched the list. Pic2Recipe was able to recognize ingredients like flour, eggs, and butter.

The program doesn’t actually identify a recipe from the photo — it creates a list of ingredients. With that list, the program can then go through that one-million-recipe database and choose the one with ingredients that match the list from the photo.
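
As a rough illustration of that matching step, here is a minimal Python sketch. The article does not say how Pic2Recipe scores a match, so the Jaccard overlap used here is an assumption, and the three-recipe database, the function names, and the predicted ingredient list are all hypothetical stand-ins for Recipe1M and the real system.

```python
def jaccard(a, b):
    """Overlap between two ingredient collections: |a & b| / |a | b|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # two empty lists share nothing worth matching
    return len(a & b) / len(a | b)


def best_recipe(predicted, recipes):
    """Return the name of the recipe whose ingredients best overlap the prediction."""
    return max(recipes, key=lambda name: jaccard(predicted, recipes[name]))


# Hypothetical toy database; the real dataset holds over one million recipes.
RECIPES = {
    "sugar cookies": {"flour", "eggs", "butter", "sugar"},
    "omelet": {"eggs", "butter", "salt"},
    "garlic bread": {"flour", "butter", "garlic cloves", "salt"},
}

# Hypothetical ingredient list, as if produced from a photo.
predicted = ["flour", "eggs", "butter"]
print(best_recipe(predicted, RECIPES))  # sugar cookies
```

A linear scan like this would be slow over a million recipes; it is only meant to show the idea of ranking stored recipes against a predicted ingredient list.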

“In computer vision, food is mostly neglected because we don’t have the large-scale datasets needed to make predictions,” said Yusuf Aytar, a postdoctoral associate who co-wrote the paper with MIT professor Antonio Torralba. “But seemingly useless photos on social media can actually provide valuable insight into healthy habits and dietary preferences.”

Because the system already has that large dataset, it can also pick up on a number of patterns: the average recipe has nine ingredients, and the most popular are salt, butter, sugar, olive oil, water, eggs, garlic cloves, milk, flour, and onion.
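
Statistics like these are simple to compute once the recipes are in one place. The sketch below shows one way to do it in Python; the `ingredient_stats` helper and the three sample recipes are hypothetical and are not taken from Recipe1M or the researchers' code.

```python
from collections import Counter


def ingredient_stats(recipes, top_n=10):
    """Return (average ingredients per recipe, the top_n most common ingredients)."""
    counts = Counter()
    total = 0
    for ingredients in recipes:
        unique = set(ingredients)  # count each ingredient once per recipe
        counts.update(unique)
        total += len(unique)
    average = total / len(recipes) if recipes else 0.0
    return average, [name for name, _ in counts.most_common(top_n)]


# Hypothetical sample data for illustration only.
SAMPLE = [
    ["salt", "butter", "flour", "eggs"],
    ["salt", "butter", "sugar"],
    ["salt", "olive oil"],
]

average, most_common = ingredient_stats(SAMPLE, top_n=2)
print(average, most_common)  # 3.0 ['salt', 'butter']
```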

The software could have a number of different real-world uses. A person could snap a photo at a restaurant to learn how to make the dish at home, or to track her personal nutrition.

Although the program draws on a wider dataset than earlier attempts, it still has a few gaps. The researchers said it has trouble with more ambiguous dishes, like smoothies and sushi rolls. Dishes that come in many variations, like lasagna, also tended to confuse the program.

The group plans to continue developing the program and even hopes to give the system the ability to tell how something is cooked, like picking up the difference between stewed and diced. Future work could also expand the program’s ability to recognize specific ingredients, like determining the type of onion instead of just listing onion.

You don’t have to wait until Pic2Recipe becomes a full-fledged app to try it. An online version lets users upload their own images.

Hillary K. Grigonis
Hillary never planned on becoming a photographer—and then she was handed a camera at her first writing job and she's been…