How did they do it? The team began with a novel data set Yahoo built itself, composed entirely of hateful or otherwise offensive article comments previously flagged by Yahoo editors (yes, human beings). The team then applied a technique known as “word embedding,” which represents words as numerical vectors and lets the algorithm evaluate words in the context of the strings they appear in, rather than one at a time. That means that even if no single word is inherently offensive, the algorithm can determine whether the phrase those words form is ultimately hurtful. This differs from most other available systems, which generally look for keywords and may miss more sophisticated hate speech or abusive content.
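To make the contrast between keyword matching and embedding-based scoring concrete, here is a minimal sketch in Python. It is not Yahoo's method: the two-dimensional word vectors, the blocklist, and the example comments are all invented for illustration, and averaging word vectors and picking the nearest centroid is a deliberately simple stand-in for a trained classifier. Real embeddings are learned from large amounts of text and have hundreds of dimensions.

```python
import math

# Hypothetical 2-D word vectors, made up for this example.
# Real embeddings are learned from data, not written by hand.
TOY_VECTORS = {
    "people": (0.1, 0.9),
    "like": (0.2, 0.8),
    "you": (0.3, 0.7),
    "are": (0.2, 0.8),
    "welcome": (0.0, 1.0),
    "here": (0.1, 0.9),
    "should": (0.6, 0.4),
    "disappear": (0.9, 0.1),
}

# Hypothetical keyword list for the baseline filter.
BLOCKLIST = {"idiot", "moron"}


def keyword_flag(comment):
    """Baseline: flag a comment only if it contains a blocklisted word."""
    return any(word in BLOCKLIST for word in comment.lower().split())


def phrase_vector(comment):
    """Average the vectors of the known words in a comment."""
    vectors = [TOY_VECTORS[w] for w in comment.lower().split() if w in TOY_VECTORS]
    if not vectors:
        return None
    return tuple(sum(axis) / len(vectors) for axis in zip(*vectors))


def centroid(comments):
    """Average the phrase vectors of a set of labeled comments."""
    vectors = [v for v in map(phrase_vector, comments) if v is not None]
    return tuple(sum(axis) / len(vectors) for axis in zip(*vectors))


def embedding_flag(comment, abusive_centroid, benign_centroid):
    """Flag a comment whose phrase vector sits closer to the abusive centroid."""
    vector = phrase_vector(comment)
    if vector is None:
        return False  # no known words, so nothing to score
    return math.dist(vector, abusive_centroid) < math.dist(vector, benign_centroid)


# Stand-ins for the editor-labeled comments described in the article.
ABUSIVE = centroid(["you should disappear"])
BENIGN = centroid(["you are welcome here"])

if __name__ == "__main__":
    for comment in ("people like you should disappear",
                    "people like you are welcome here"):
        print(comment, keyword_flag(comment), embedding_flag(comment, ABUSIVE, BENIGN))
```

Neither test comment contains a blocklisted word, so the keyword filter passes both. The embedding-based check separates them because it scores the phrase as a whole against labeled examples.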
“Automatically identifying abuse is surprisingly difficult,” researcher Alex Krasodomski-Jones of the U.K.-based Centre for Analysis of Social Media told the MIT Technology Review. “The language of abuse is amorphous — changing frequently and often used in ways that do not connote abuse, such as when racially or sexually charged terms are appropriated by the groups they once denigrated.”
He continued, “Given 10 tweets, a group of humans will rarely all agree on which ones should be classed as abusive, so you can imagine how difficult it would be for a computer.”
Still, a machine’s assistance seems like a helpful step forward, especially given the sheer volume of content now on the web.