The photograph above is pretty straightforward: it doesn’t take much expertise to see that it’s of President Barack Obama. And even if you’ve been living under a rock for the past decade or so, the White House and presidential seals are pretty big clues, at least for a human. But for a computer? Not so much. One newly rebranded company is working to change that with software that can identify historical figures from centuries before the term “artificial intelligence” was even thought of.
Project Gado began as a way to develop scanning hardware for quickly digitizing visual history, from images to sheet music. Six years later, the company has accomplished that goal and set out on another challenge: making sense of all that visual data, quickly and efficiently.
Now rebranded as Gado Images, the company is teaching computers how to interpret each image, automatically adding tags, labeling historical figures and even writing a caption. That photo of Obama? Not only did the beta software correctly identify the president, where he was, and what he was doing, but it also knew enough to indicate that he’s the 44th president and that the image was taken on December 5, 2013. The system also identifies much older historical figures, along with their significance, which is pretty good considering most Americans don’t know that Franklin Roosevelt was a president. Currently, the system can recognize over 60,000 well-known personalities, both current and historic.
While a historian could properly identify many images, the Cognitive Metadata Platform makes it possible to label large volumes of visual history quickly and efficiently, simplifying a challenge that many museums and art galleries face.
The web-based Cognitive Metadata Platform, currently in beta testing, is expected to launch later this year. The software works by combining a few existing technologies (like IBM’s Watson and Google Vision) with original programming. It uses facial recognition, object recognition, and optical character recognition, which finds and reads text in the image, to generate a few keywords.
Using neural networks, the system is able to make sense of all of those keywords and even add related terms. When the facial recognition identifies Martin Luther King Jr. in an image, for example, it will also add related keywords like “civil rights.”
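To make the idea of related keywords concrete, here is a minimal sketch in Python. It is not Gado Images’ code: the article says the platform learns these associations with neural networks, while this example swaps in a small hand-written lookup table, and the names RELATED_TERMS and expand_keywords are hypothetical.

```python
# Hypothetical lookup table standing in for associations the real system
# is described as learning; the entries here are illustrative only.
RELATED_TERMS = {
    "Martin Luther King Jr.": ["civil rights"],
}


def expand_keywords(detected, related=RELATED_TERMS):
    """Return the detected keywords plus any related terms, without duplicates."""
    expanded = []
    for keyword in detected:
        # Keep the detected keyword first, then append its related terms.
        for term in [keyword, *related.get(keyword, [])]:
            if term not in expanded:
                expanded.append(term)
    return expanded


keywords = expand_keywords(["Martin Luther King Jr.", "podium"])
print(keywords)
```

A keyword with no entry in the table, such as “podium” here, simply passes through unchanged.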
“By combining all the different inputs together [the system] can become quite confident in what it’s seeing in the image,” Gado Images’ CEO, Thomas Smith, told Digital Trends. “It’s a learning system, learning relationships between entities and specific collections.” If the program knows that two people are often seen together, it will often tag both people or generate related keywords, he added.
The system can also estimate a time frame for when the image was taken, even for scans of photographs that come with no digital metadata. Using historical information like the person’s birth date together with age-approximation software, the program can venture a pretty good estimate of when the photo was shot.
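The arithmetic behind that kind of estimate is simple enough to sketch in Python. This is an illustration under stated assumptions, not the platform’s actual method: the function name estimate_photo_year, the apparent age of 52, and the five-year margin of error are all hypothetical.

```python
def estimate_photo_year(birth_year, apparent_age, margin=5):
    """Return (earliest, likely, latest) years in which a photo was probably taken.

    birth_year comes from historical records; apparent_age is what an
    age-approximation model might report. margin is an assumed error bound.
    """
    likely = birth_year + apparent_age
    return likely - margin, likely, likely + margin


# Barack Obama was born in 1961; assume the age software reads about 52.
earliest, likely, latest = estimate_photo_year(1961, 52)
print(f"Probably taken around {likely} (between {earliest} and {latest})")
```

The wider the margin of error in the age estimate, the wider the resulting time frame, which is why the result is a range rather than a single date.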
For historical purposes, the optical character recognition also plays a big role. Many archival photographs have a date or notes from the photographer on the back. With the OCR and a scan of both the front and back, the system can add the original notes, as well as more keywords relevant to those notes.
The program works on more than just photographs, and can help archive a number of different types of visual history, including drawings and even sheet music. Here’s how the system interpreted a few more visuals, including a postcard and a drawing:
The program was even able to correctly identify Digital Trends from a text-free logo, adding “consumer electronics” and “news” among the list of keywords.
As a neural network system, the platform will continue to “learn,” tracking any human input to use with future scans.
The platform will even put the keywords together into a brief caption. “In some cases, we have human annotators write a caption for the image, and use these entities as keywords,” Smith explained. “In other cases, the CMP actually uses natural language processing to turn the entities into an automatically generated, human-readable sentence caption for the image.”
In the image of Obama, the software generated this short caption: “44th President of the United States Barack Obama at the White House, Washington, DC, December 5, 2013.”
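As a rough illustration of how recognized entities could be assembled into a sentence like that one, here is a short Python sketch. Smith describes the CMP as using natural language processing; this example swaps in a plain string template instead, and the function build_caption and its parameters are hypothetical.

```python
from datetime import date


def build_caption(title, name, place, taken_on):
    """Join recognized entities into a one-sentence caption using a fixed template."""
    # Build "December 5, 2013" by hand so the day has no leading zero.
    date_text = f"{taken_on:%B} {taken_on.day}, {taken_on.year}"
    return f"{title} {name} at {place}, {date_text}."


caption = build_caption(
    title="44th President of the United States",
    name="Barack Obama",
    place="the White House, Washington, DC",
    taken_on=date(2013, 12, 5),
)
print(caption)
```

A fixed template like this only works when every entity is present; a real captioning system would need to handle missing people, places, or dates.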
Through a partnership with Getty Images, Gado Images is working to allow museums, art galleries, and private collectors to monetize their collections. That gives non-profit historical organizations a recurring revenue source, and also saves time (and money) on the scanning and labeling process. For buyers to find an image they would like to use, the images need to be tagged correctly, which is where the keywords and captions from the Cognitive Metadata Platform come in.
“It’s a great way for a cultural heritage organization to take their archives and turn it into a source of recurring revenue,” Smith said. Along with serving as a platform to monetize historical collections, the automatic keyword and caption generation saves organizations the time (and cost) required to add the information manually.
While the company has moved from physical hardware to software development, the new program still jibes with the company’s mission to digitize and share the world’s history. One of the company’s earliest projects was to help the Baltimore Afro-American, the longest-running African-American newspaper, digitize a 1.5-million-image archive. “It’s really expensive and challenging to digitize and annotate archives at large. They had only scanned about 5,000 images at the time, so we launched Gado to help organizations to scan images and then make sense of those photos in a way that makes sense for a small organization,” Smith said.
While the software’s current focus is historical visual data, Smith says it has the potential to be used in a variety of industries in the future. For example, a media company could use the system to create an archive and easily find images of a specific location, object, or person. Photographers could use it to archive their work and quickly bring up an image of a specific client.