Crowdsourcing Key to Netflix Contest Winners

By Ian Bell October 2, 2009


If you procrastinate after three hours of work, imagine how hard it is to keep coming back to a project over the course of three years.

That’s exactly what seven engineers, researchers and scientists from around the globe did in an attempt to improve Netflix’s movie recommendation algorithm by 10 percent or more. And their diligence paid off recently when the movie rental company awarded $1 million to team BellKor’s Pragmatic Chaos.

The team submitted its final formula about 20 minutes before the contest ended in late July, beating out close competitor The Ensemble. More than 50,000 people vied for the prize over the course of the three-year competition.

The Method to the Madness

BellKor’s Pragmatic Chaos is a combination of three teams (BellKor, PragmaticTheory and Big Chaos) that joined forces to finish their submission to the competition. The members are: Bob Bell and Chris Volinsky, of the statistics research department at AT&T Research; Andreas Töscher and Michael Jahrer, machine learning researchers and founders of commendo research and consulting in Austria; electrical engineer Martin Piotte and software engineer Martin Chabbert of Montreal, founders of Pragmatic Theory; and Yehuda Koren, senior research scientist at Yahoo! Research Israel. They met for the first time on Monday, Sept. 21, when Netflix announced the winners.

In June, BellKor’s Pragmatic Chaos became the first team to surpass the 10 percent mark, which sparked a 30-day period during which other contestants could try to beat its score. Rival team The Ensemble submitted its solution in late July, just minutes before the deadline. BellKor’s Pragmatic Chaos’ winning entry improved on Netflix’s existing system by 10.06 percent.

The attempt to cut the root mean squared error (RMSE) on test data by 10 percent compared to Cinematch, the technology Netflix currently uses to recommend movies to members, drew upon collaborative filtering. The methodology looks at the past behavior of users who share similar rating patterns to formulate predictions for other users. Using a dataset of more than 100 million movie ratings, BellKor’s Pragmatic Chaos combined algorithms and drew upon “a variety of models that complement the shortcomings of each other,” according to one of the papers published by team BellKor.
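For readers curious about the yardstick, here is a minimal Python sketch of how RMSE and a percentage improvement over a baseline can be computed. The ratings, the function names and the two sets of predictions are made up for illustration; this is not the contest's scoring code or either system's output.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def percent_improvement(baseline_rmse, new_rmse):
    """Relative RMSE reduction versus a baseline, in percent."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Made-up ratings on a 1-5 star scale, purely for illustration.
actual = [4, 3, 5, 2, 4]
baseline = [3.5, 3.5, 4.0, 3.0, 3.5]   # hypothetical stand-in for an existing system
candidate = [3.8, 3.2, 4.6, 2.4, 3.9]  # hypothetical stand-in for a new model

base_err = rmse(baseline, actual)
cand_err = rmse(candidate, actual)
print(f"baseline RMSE:  {base_err:.4f}")
print(f"candidate RMSE: {cand_err:.4f}")
print(f"improvement:    {percent_improvement(base_err, cand_err):.2f}%")
```

Because errors are squared before averaging, a few badly wrong predictions hurt the score more than many slightly wrong ones, which is part of why small percentage gains were hard to come by.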

These included nearest neighbor models (which identify pairs of items that tend to be rated similarly by users in order to predict ratings for an unrated item) and latent factor models (which probe hidden features that explain the observed ratings). The team also peered behind the ratings to uncover additional data such as which movies a person rated.
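The nearest neighbor idea can be sketched in a few lines of Python. What follows is a simplified, item-based version using plain cosine similarity, assumed here for illustration; the ratings are invented, and the team's actual neighborhood models were considerably more elaborate.

```python
import math

# Made-up ratings (user -> {movie: stars}), purely for illustration.
ratings = {
    "ann":  {"A": 5, "B": 4, "C": 1},
    "bob":  {"A": 4, "B": 5, "C": 2},
    "cara": {"A": 1, "B": 2, "C": 5},
    "dan":  {"A": 5, "C": 1},          # has not rated "B"
}

def item_similarity(ratings, item_x, item_y):
    """Cosine similarity computed over users who rated both items."""
    pairs = [(r[item_x], r[item_y]) for r in ratings.values()
             if item_x in r and item_y in r]
    if not pairs:
        return 0.0
    dot = sum(x * y for x, y in pairs)
    norm_x = math.sqrt(sum(x * x for x, _ in pairs))
    norm_y = math.sqrt(sum(y * y for _, y in pairs))
    if norm_x == 0 or norm_y == 0:
        return 0.0
    return dot / (norm_x * norm_y)

def predict(ratings, user, item, k=2):
    """Similarity-weighted average of the user's ratings on the k items most similar to `item`."""
    rated = ratings[user]
    neighbors = sorted(
        ((item_similarity(ratings, item, other), stars)
         for other, stars in rated.items() if other != item),
        reverse=True,
    )[:k]
    total_weight = sum(sim for sim, _ in neighbors)
    if total_weight == 0:
        return None  # no usable neighbors for this user and item
    return sum(sim * stars for sim, stars in neighbors) / total_weight

print(predict(ratings, "dan", "B"))
```

In this toy data, movie B is rated much like movie A, so the estimate for dan leans toward the five stars he gave A rather than the one star he gave C.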

The team was able to determine that:

viewers use different criteria to rate movies they saw a long time ago compared to ones they saw recently;
some movies seem to grow on viewers over time; and
viewers rate movies differently on different days of the week.

Using that information, the team created a three-dimensional model that focused on how time affects the relationship between people and movies.
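One simple way to let time into a prediction is to allow a movie's average rating to drift from one period to the next. The Python sketch below does only that, using invented records and a hypothetical one-year bin width; it covers the movie and time dimensions but not the per-user effects, and it should not be read as the team's model.

```python
from collections import defaultdict

# Made-up records: (user, movie, day_index, stars), purely for illustration.
records = [
    ("ann", "A", 10, 3), ("bob", "A", 20, 3),
    ("cara", "A", 400, 5), ("dan", "A", 410, 4),
    ("ann", "B", 15, 4), ("bob", "B", 405, 4),
]

BIN_DAYS = 365  # hypothetical bin width, not a value from the contest

def fit_time_binned_baseline(records, bin_days=BIN_DAYS):
    """Return the global mean and a per-(movie, time bin) offset from that mean."""
    global_mean = sum(stars for _, _, _, stars in records) / len(records)
    residuals = defaultdict(list)
    for _, movie, day, stars in records:
        residuals[(movie, day // bin_days)].append(stars - global_mean)
    bias = {key: sum(vals) / len(vals) for key, vals in residuals.items()}
    return global_mean, bias

def predict(global_mean, bias, movie, day, bin_days=BIN_DAYS):
    """Global mean plus the movie's offset for the bin containing `day` (0 if unseen)."""
    return global_mean + bias.get((movie, day // bin_days), 0.0)

global_mean, bias = fit_time_binned_baseline(records)
print(predict(global_mean, bias, "A", 30))   # early period for movie A
print(predict(global_mean, bias, "A", 420))  # later period for movie A
```

In the toy data, movie A is rated higher in its second year than in its first, so the later prediction comes out above the earlier one, a crude stand-in for a film that grows on viewers.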

A Winning Combination

While the methodology behind the solution is important, perhaps more interesting was the contest’s indication that crowdsourcing can produce better results than looking in-house.

Chris Volinsky of team BellKor says Netflix made a smart move by “realizing that there was a research community out there that worked on these kinds of models and was starving for data.

“Netflix had the data, but only a handful of people are working on the problem,” he says. “The prize connected these two in a way that was sensitive to their proprietary data … This model doesn’t work for every domain — it worked here because the data was interesting, and it was a compelling topic. Everyone can relate to movies. A similar competition for, say, automatic language translation, might not generate as much passion.”

Andreas Töscher, originally of team Big Chaos, says more competitions like Netflix’s are in store. He spoke to the remote nature of his team’s particular crowdsourcing experience: prior to Monday, he hadn’t even spoken to his teammates, let alone laid eyes on them. “It was great meeting the rest of the team, after working together for over a half year. We had never a telephone call. From Martin and Martin, we had not seen pictures until one week ago.”

Martin Chabbert, who was originally part of the PragmaticTheory team, says that while it was hard to focus on the contest while juggling work and family responsibilities, it was harder to avoid logging on to the computer to test out a new idea for the project. While his engineering background helped the team’s efforts, not getting bogged down by the theoretical aspects of the work helped equally.

“I think one of the important qualities to being successful in this field is the ability to translate intuition about human behavior into an actual mathematical and algorithmic model,” Chabbert says. “A lot of people have ideas as to what should be captured, but the key is in finding the proper way to capture it. I believe that we did a good job at that. Also, not coming from an academic background, we were very focused on the task at hand, rather than trying to find things that had theoretical grounding or that would necessarily advance the general science.”

The father of four says each member of his team brought something that contributed to the winning score. Team BellKor member Yehuda Koren’s algorithms and papers were paramount, while Big Chaos’ management of all of the models and prediction sets coming from each sub-team proved key. Chabbert and Martin Piotte credit their “pragmatic” approach for yielding a wide array of original models and combinations.
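To see why combining prediction sets from different models can pay off, consider this minimal Python sketch of a two-model linear blend, with the weight chosen by a simple grid search. The data, model names and search are assumptions made for illustration; the team's actual blending procedure is not described here.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def blend(preds_a, preds_b, weight):
    """Weighted average of two prediction sets; `weight` applies to the first."""
    return [weight * a + (1 - weight) * b for a, b in zip(preds_a, preds_b)]

def best_weight(preds_a, preds_b, actual, steps=100):
    """Grid-search the blend weight in [0, 1] that minimizes RMSE on held-out ratings."""
    candidates = [i / steps for i in range(steps + 1)]
    return min(candidates, key=lambda w: rmse(blend(preds_a, preds_b, w), actual))

# Made-up held-out ratings and two hypothetical models with opposite tendencies.
actual = [4, 3, 5, 2, 4]
model_a = [4.6, 3.5, 5.0, 2.6, 4.4]  # tends to overshoot
model_b = [3.5, 2.4, 4.4, 1.5, 3.4]  # tends to undershoot

w = best_weight(model_a, model_b, actual)
print("weight on model_a:", w)
print("model_a RMSE:", rmse(model_a, actual))
print("model_b RMSE:", rmse(model_b, actual))
print("blend RMSE:  ", rmse(blend(model_a, model_b, w), actual))
```

When two models err in different directions, their mistakes partly cancel, so the blend can score better than either model alone. In practice the weight would be tuned on ratings held out from training.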

Volinsky says the AT&T IP organization owns the intellectual property rights to the inventions from the competition, but would consider looking for opportunities to license them externally. All three of the teammates quoted here say they will consider entering Netflix’s second competition, which will focus on creating taste profiles for individual users based on demographic and usage data.

Lauren Fritsky is a freelance writer and professional blogger based outside Philadelphia. Her work has appeared in several newspapers and magazines and on sites such as AOL and CNN.
