Twitter is an indispensable source of live commentary and news. We know that, the site’s fervent users know that, and media agencies know that. The task for news outlets looking to harness the platform’s deluge of data, however, has always been how best to find breaking events as they occur. Plenty of journalists track the site manually or by using third-party analytics software, but would an automated system be able to do the job more efficiently?
That’s the question Reuters set out to answer two years ago when it began work on its news-surfacing algorithm. Last week, it unveiled its custom tool, dubbed the “Reuters News Tracer,” to NiemanLab and the Columbia Journalism Review.
The news agency claims its software is capable of identifying breaking news events on Twitter while initial reports are still coming in. The process begins with detection, which sees the algorithm bunch relevant tweets into events, allowing it to generate metadata regarding the story’s topic. Tweets that contain the words “explosion” and “bomb,” for example, could be categorized together as denoting a potential terrorist attack.
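To make the detection step concrete, here is a minimal sketch of keyword-based grouping in Python. Reuters has not published its code, so everything here is an assumption for illustration: the `TOPIC_KEYWORDS` map, the topic labels, and the `detect_events` function are hypothetical, and the real system is almost certainly more sophisticated than a word match.

```python
import re
from collections import defaultdict

# Hypothetical keyword-to-topic map; the actual vocabulary used by
# Reuters News Tracer is not public.
TOPIC_KEYWORDS = {
    "possible attack": {"explosion", "bomb"},
    "earthquake": {"earthquake", "tremor"},
}

def detect_events(tweets):
    """Group tweet texts under every topic whose keywords they mention."""
    events = defaultdict(list)
    for text in tweets:
        # Lowercase and split into words so matching ignores case and punctuation.
        words = set(re.findall(r"[a-z']+", text.lower()))
        for topic, keywords in TOPIC_KEYWORDS.items():
            if words & keywords:
                events[topic].append(text)
    return dict(events)

sample = [
    "Loud explosion heard downtown",
    "Police say a bomb went off",
    "Lovely weather today",
]
events = detect_events(sample)
```

In this toy version, the first two sample tweets land in the same "possible attack" event, while the third matches nothing and is ignored.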
After detection comes the harder task of verification. With fake stories currently plaguing social media sites such as Facebook, authenticity is of the utmost importance for a news publisher.
According to Reuters, its biggest (and most important) challenge when developing the algorithm was figuring out which events were newsworthy rather than spam. To do this, its system assigns a verification score to tweets based on 40 factors, including whether the report is from a verified account, how many followers the account has, whether the tweets contain links and images, and the structure of the tweets themselves. “Amazingly enough, a tweet that is entirely in capital letters is less likely to be true,” Reg Chua, Reuters’ executive editor of data and innovation, told NiemanLab.
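A scoring step along these lines could be sketched as follows. This is not Reuters' formula: the `Tweet` fields, the weights, and the follower cutoff are all invented for illustration, and only five of the reported 40 factors appear. The article also does not say whether links and images raise or lower the score, so treating them as positive signals is an assumption.

```python
from dataclasses import dataclass

@dataclass
class Tweet:
    text: str
    verified_account: bool
    followers: int
    has_link: bool
    has_image: bool

def verification_score(tweet):
    """Return a toy credibility score; all weights are hypothetical."""
    score = 0.0
    if tweet.verified_account:
        score += 2.0
    if tweet.followers >= 10_000:  # hypothetical cutoff
        score += 1.0
    if tweet.has_link:  # assumed to be a positive signal
        score += 0.5
    if tweet.has_image:  # assumed to be a positive signal
        score += 0.5
    # Penalize tweets written entirely in capital letters, per Chua's remark.
    letters = [c for c in tweet.text if c.isalpha()]
    if letters and all(c.isupper() for c in letters):
        score -= 1.0
    return score

calm = Tweet("Explosion reported on 23rd Street", True, 50_000, True, False)
shouty = Tweet("BOMB IN CHELSEA!!!", False, 120, False, False)
```

Under these made-up weights, the measured report from a verified, widely followed account scores well above the all-caps tweet from a small unverified one.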
If the tweets assigned under an event reach a certain combined score, Reuters claims it has the assurance it requires in order to tweet and report on the issue. The score will then change as more reporters begin covering the story, with some events decreasing and others increasing in value.
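The threshold check described here might look something like the sketch below. The article does not say how individual scores are combined or what the cutoff is, so the simple sum, the `PUBLISH_THRESHOLD` value, and the `is_reportable` function are all assumptions.

```python
# Hypothetical cutoff; Reuters has not disclosed its actual threshold.
PUBLISH_THRESHOLD = 5.0

def is_reportable(tweet_scores, threshold=PUBLISH_THRESHOLD):
    """Return True once an event's combined tweet scores reach the threshold.

    Summing the scores is an assumption; the real combination rule is unknown.
    """
    return sum(tweet_scores) >= threshold

# Re-run the check as new tweets join an event and its total shifts.
early = is_reportable([3.5, -1.0])
later = is_reportable([3.5, -1.0, 3.0])
```

Because the check is re-evaluated as tweets accumulate, an event that falls short early on can cross the line later, which matches the description of scores rising and falling as coverage develops.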
According to Chua, the tool has helped Reuters to cover the news faster. For example, the agency gained a 15-minute head start on a news alert regarding the Chelsea bombing in New York in September. It also helps Reuters to cover “witnessable events” in places where it may not have a physical presence.
“With the proliferation of smartphones and social media, it means that there are a lot more witnesses to a lot more events,” said Chua. “We can’t be at everything. Our tool helps shift some of the burden of witnessing and lets journalists do much more of the high value-added work.”