Flickr's simple 'Park or Bird' tool is actually a demo of complex image recognition

The new tool was inspired by this xkcd comic, which presented Flickr’s engineers with a challenge.

Flickr’s engineers have built a new tool that can tell whether a photo was shot in a national park and whether it contains a bird. You simply upload an image, and within a couple of seconds Flickr returns the results. Why would Flickr devote money, time, and resources to something our eyes can easily pick out? While the new “Flickr Park or Bird” feature seems pointless, it actually demonstrates the complex image recognition software Flickr is employing in its search algorithms. What seems easy for us humans to discern is more complicated for computers, yet the feature shows how far software has come and what the future of image search will be like.

Determining whether an image was taken in a park is relatively easy, as long as GPS data is embedded. Flickr matches the GPS info with records in a database, and can tell you the exact name of the park where the photo was taken. If there is no GPS info, the result is returned as question marks; for one image we uploaded, Flickr had no GPS data to work from, but it was still able to tell that the photo was taken indoors.
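The GPS lookup described above can be sketched in a few lines of Python. This is only an illustration, not Flickr's code: the `PARKS` table, its rough bounding boxes, and the `find_park` function are all assumptions made for the example, and a real database would use exact park boundaries rather than rectangles.

```python
from typing import Optional

# Hypothetical, hand-made table (an assumption for this sketch).
# Each entry is a rough bounding box: (south, north, west, east) in degrees.
PARKS = {
    "Yosemite National Park": (37.49, 38.19, -119.89, -119.20),
    "Yellowstone National Park": (44.13, 45.11, -111.16, -109.83),
}


def find_park(lat: Optional[float], lon: Optional[float]) -> str:
    """Return the name of the park containing the coordinates, if any."""
    # No GPS data embedded in the photo: nothing to match against.
    if lat is None or lon is None:
        return "???"
    for name, (south, north, west, east) in PARKS.items():
        if south <= lat <= north and west <= lon <= east:
            return name
    return "not in a known park"


if __name__ == "__main__":
    print(find_park(37.75, -119.59))  # a point in Yosemite Valley
    print(find_park(None, None))      # photo without GPS data
```

The lookup itself is simple arithmetic, which is why the "park" half of the question is the easy one whenever coordinates are available.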

Recognizing a bird (or anything else for that matter) in an image is more involved. Flickr says its Vision team “has been working for the last year or so to be able to recognize more than 1,000 things in images using deep convolutional neural nets,” and one of the things its software is good at is finding birds. The method is a bit technical to explain (you can read more about it here), but simply put, the software passes an input image (say, a photo of a bird) through layer after layer of learned feature detectors; one layer “might recognize the most basic image features, such as short straight lines, corners, and small circular arcs,” while another layer responds to more complex shapes, and “further layers might recognize higher-level concepts, like eyes and beaks.”
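To give a feel for what a single early layer does, here is a minimal sketch of one convolutional filter responding to a vertical edge. It is not Flickr's network: the `feature_map` function, the hand-picked `VERTICAL_EDGE` kernel, and the tiny `IMAGE` are assumptions for illustration. A trained network learns thousands of such filters from data instead of having them written by hand, and stacks many layers of them.

```python
def feature_map(image, kernel):
    """Slide the kernel over the image and keep only positive responses.

    This is the sliding-window operation used in convolutional nets
    (strictly a cross-correlation), followed by a ReLU, max(0, x).
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            row.append(max(0, total))
        out.append(row)
    return out


# Hand-picked filter that responds to dark-to-light vertical edges.
VERTICAL_EDGE = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

# Tiny grayscale "photo": dark on the left, bright on the right.
IMAGE = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

if __name__ == "__main__":
    for row in feature_map(IMAGE, VERTICAL_EDGE):
        print(row)  # each row is [0, 3, 3, 0]
```

The output is zero over the flat regions and positive only where the window straddles the edge. Later layers take maps like this one as their input, which is how simple lines and corners can be combined into shapes and, eventually, into concepts like beaks.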

Flickr says its Vision team “is already applying this deep network to Flickr photos to help people more easily find what they’re looking for via Flickr search, and we plan to integrate it into Flickr in other cool ways in the future. We’re also working on other innovative computer vision and image recognition technologies that will make it easier for Flickr members to find and organize their photos.” Because the software recognizes what’s in a photo, users in the future won’t have to manually tag their photos with text; the software will be able to pick those things out automatically.

It’s not perfect, as this image upload shows. Flickr couldn’t determine where the photo was shot because of missing GPS info, and it also thought this famous Internet feline was a bird.