In case you missed the movie Moneyball, here’s a quick summary: It’s based on the true story of Billy Beane – general manager of the Oakland A’s baseball team – and how he used computer-generated statistical analysis to overcome a tight budget and wrangle a roster of inexperienced players. Brad Pitt may have been nominated for an Oscar for his performance, but the art of sabermetrics managed to steal the spotlight.

We’ve fallen in love with data – look no further than the 2012 presidential election and the triumph of Nate Silver to prove this. And Silver, of course, got his start in stats with baseball. 

There’s also Shane Battier – currently an NBA player for the Miami Heat – who’s regarded as a “No Stats All-Star” for his immense pre-game preparation, which entails exhaustive study of his opponents, especially the players he is assigned to guard. According to a New York Times feature on Battier, high-level statistical data compiled on all of his opponents allows him to learn the weaknesses of even a better team.

The world of sports has clearly been able to turn metrics into measurable, real-world predictions … so why shouldn’t it work for other markets as well? Why not use math to see into music’s immediate future? It’s happening. Thanks to the rise of online music consumption and the use of social media to discuss musicians, we have a clearer window into music consumption than ever before. Artists looking to break through to mainstream success may need to look no further than the numbers to chart their way to the top. But the question remains: Can something as personal and abstract as music be based on metrics, or does fate still have a hand in it all?

The details in the fabric (of music data)

Big record companies have always crunched numbers to divine the next big acts – ultimately, every successful star is a cash cow for someone. The difference is, we now have a lot more numbers to look at than record sales and radio plays, and this information is available to anyone, not just record company bigwigs. You and I have the tools to root out promising musicians. Before we do, though, it’s important to know which data points are being analyzed to reach these conclusions.

Detail # 1: What we like, or more importantly, what we ‘like’ on our various social media outlets. Let’s face it – along with the hashtag and the ❤ button, the Facebook ‘like’ is powerful, maybe powerful enough to predict music’s next biggest shining star. Every time you post a YouTube video or your favorite song lyric, every time you use an app to invite friends to a concert you bought tickets to, every time you share that you bought an album, you are making it easier for the Internet – and the world – to determine which acts are worth watching.

Social media metrics are one of the key ingredients in the formula Next Big Sound uses to identify music successes in the making. Every member of the service can study a comprehensive overview and tally of any music artist’s page views, likes, followers, and mentions on their official social accounts. Detailed graphs make comparisons with similar artists easy. For the casual and curious, this information is enough to go on – if a not-so-famous band’s Facebook page likes are climbing into the millions, the chances of the band hitting it big by the end of the year are high. The same goes for the indie artist with more than a hundred thousand followers on Twitter. Once those heights are reached, it’s time for the fan clubs, talent managers, and record label execs to take notice.

Detail # 2: What we buy. Music is a product, and we are its consumers. The study of consumer behavior and music-related purchase patterns opens the door to a lot of possibilities. When bands find out which of their songs are most liked, they can make sure they perform those songs more often at their concerts. When record labels see that a certain type of album is selling like hotcakes on iTunes, they make sure they release more singles from that album or come out with a totally different (acoustic, live, string quartet) version of it.

A perfect example of using consumer behavior to music’s advantage is EMI Music’s One Million Interview Dataset. In partnership with Data Science London, EMI’s initiative promises to be the “richest and largest music dataset ever.” It comprises a million interviews covering topics like level of passion for a particular music genre and sub-genre, preferred methods for music discovery, favorite music artists, thoughts on music piracy, music streaming, music formats, and fan demographics.

David Boyle, Senior Vice President for Insight at EMI Music, is optimistic that by releasing this massive collection of information to the public, more people in the music industry will take notice and use data to improve the quality of the business. “We’ve had great success using data to help us and our artists understand consumers, and we’re excited to share some of our data to help others do the same,” Boyle says. “We also recognize that other people looking at this data will spot things we missed; different perspectives and experiences will tease out different insights. So we’re excited to see what people do with this data and to learn from that.”

EMI’s larger dataset can certainly be used to reveal which music artists people should watch out for this year. According to Boyle, studying and analyzing music consumer behavior can arm users with a better ability to predict which acts’ careers may take off in the near future.

Detail # 3: What format we prefer. Have the convenience and ease of music-sharing online really affected revenues in the music business? How many people still prefer the physical CD over the digital MP3? Are there still enough people who want to reward creators of music to keep the industry afloat? According to EMI Music’s report, people no longer pay for music the way they did before, and recorded music sales have been in constant decline since 2001. Amassing actual music data straight from the source (music listeners) will enable EMI, as well as other members of the music industry, to figure out what the problem is and come up with a strategy that will satisfy true music fans.

These days, more people are accustomed to using music apps like Spotify and Pandora to listen to new music. The gateway to improving music discovery is wide open, and The Echo Nest is one of the companies taking its first steps through it. It provides music intelligence that can aid developers in building sophisticated music apps. That includes advanced music playlisting, taste profiling, personalized radio capabilities, music-related news feeds, gaming applications, and “fanalytics” – all backed by more than a trillion (yes, trillion) data points connected to more than 30 million songs.

In an article titled “Data Science and the Music Industry: What Social Media Has To Do With Record Sales,” members of the Next Big Sound team analyze the impact of social media on iTunes album and track sales by comparing artists’ social media metrics with their sales revenue. They confirmed the obvious: Social media did affect album and track sales. However, their specific findings are far more interesting. Radio plays and YouTube have the largest effect on track sales, and it makes sense: We hear a great song on our car radio, so we go to YouTube to get to know it better at our leisure. Knowing that, record label execs will now prioritize spectacular YouTube music videos for the singles they release, to captivate a bigger audience.

For album sales, it gets a little trickier – to study how social media affects them, the team considers activity both a week before and a week after an album’s release. Their analysis reveals that album sales are most affected by – get this – Wikipedia page views. Consumers need to know more about an artist before they get invested, so it is imperative for artists to keep their Wikipedia page relevant and up-to-date.
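For the curious, here is a minimal sketch of how a comparison like this could be run at home. The article does not say which statistical method the Next Big Sound team used, so the Pearson correlation below is an assumption, and every number is invented purely for illustration (daily figures for the week before and the week after a hypothetical album release). The function names pearson and rank_metrics are likewise made up for this example. Keep in mind that a strong correlation only shows that two series move together, not that one drives the other.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return cov / sqrt(var_x * var_y)


def rank_metrics(metrics, sales):
    """Return (metric name, correlation with sales) pairs, strongest first."""
    scores = {name: pearson(series, sales) for name, series in metrics.items()}
    return sorted(scores.items(), key=lambda item: abs(item[1]), reverse=True)


# INVENTED daily figures: 7 days before and 7 days after a hypothetical
# album release. These are NOT Next Big Sound's or EMI's numbers.
album_sales = [10, 12, 11, 15, 18, 25, 40, 300, 220, 150, 110, 90, 70, 60]
metrics = {
    "wikipedia_page_views": [200, 230, 220, 300, 350, 500, 800,
                             6000, 4500, 3000, 2300, 1800, 1500, 1300],
    "new_facebook_likes": [50, 55, 48, 60, 52, 58, 61,
                           70, 66, 59, 63, 57, 54, 60],
}

if __name__ == "__main__":
    for name, score in rank_metrics(metrics, album_sales):
        print(f"{name}: {score:.2f}")
```

Running the script prints each metric next to its correlation with sales, strongest first. With real data, the same two functions would work on any set of daily or weekly series of equal length.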

Detail # 4: What the math says. EMI Music, together with Data Science London, organized a Music Data Science Hackathon last July, in which data scientists were given access to parts of EMI’s dataset. They could apply their own algorithms to it to try to predict what type of music people would love. Shanda Innovations, a tech incubator of Shanda Corporation from China, won the competition.

What – and who – is about to hit the big time

So we have the data. Now what can we tell from it?

“If you are looking to find who is really going to blow up in 2013, acts such as Atlas Genius, HAIM, Jessie Ware, and Trinidad James come highly recommended. Or at least that is what the numbers say,” says Liv Buli, a data journalist for Next Big Sound.

Those involved with the dataset project research new artists meticulously at very early stages before sharing any information about them. “Often we’ll see very exciting results before the public has had a chance to fall in love with an artist,” said Boyle. Here’s what they’re willing to offer, though:

The company plans to release an updated and more specific dataset to the public sometime this year.

The real question is … can it work?

Music discovery will always be a challenge, despite having apps that are supposed to make it easier. Music data can certainly aid the industry in its quest to be better, and these services and algorithms are definitely a step in the direction of “it’s possible.” However, there’s always room for doubt when it comes to the future – even developers themselves will say so.

“Music is not a math problem,” said Shane Tobin of The Echo Nest during last year’s SF MusicTech Summit, according to TechHive. “It has to be informed from a human element. The way our recommendations work is by understanding what humans have to say.”

Thankfully, something as intangible as music can’t be summed up in an equation – and the human touch remains the most important factor in determining the next big thing in music. But a large part of music discovery relates to social behaviors, and it just so happens that many of our social interactions are happening online. As long as those who plan to work with music data are able to integrate personal taste and recommendation into their projects in a seamless manner, there’s no reason why it can’t work.