Education has long been lauded as our most powerful weapon in the war on poverty, and literacy is perhaps education’s keystone. But around the world, there are some 750 million individuals who are unable to read and write, and two-thirds of that demographic are women.
Identifying who and where these individuals are is an ongoing challenge. It has traditionally been addressed with household surveys, but that method is neither efficient nor economical, and organizations have been searching for an alternative. Now there just might be a solution: mobile phone call records.
Pål Sundsøy at Telenor Group Research in Fornebu, Norway, believes he’s found a way to determine literacy rates using little more than readily available information from a mobile phone company. As the MIT Technology Review reported, Sundsøy began his research with a “standard household survey of 76,000 mobile phone users living in an unidentified developing country in Asia,” which was conducted on behalf of a mobile phone operator by a professional agency.
The survey logged participants’ mobile phone numbers and whether or not they could read. Sundsøy then matched this information with call data records from the mobile phone company, which allowed him to see “the numbers each person has called or texted, the length of these calls, air time purchases, cell tower locations, and so on.”
These records, Sundsøy said, allowed him to determine where each phone user was when using the phone, whom they called or texted, how many texts they received, at what times, and so on. Using this treasure trove of information, the researcher could build a type of social network for users.
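To make the idea of turning call records into per-user features concrete, here is a minimal sketch. The record layout, the sample rows, and the function and feature names (`user_features`, `top_contact_share`, and so on) are all hypothetical and chosen for illustration; the article does not describe the actual feature set or data format used in the study.

```python
from collections import Counter, defaultdict

# Hypothetical call-record rows: (sender, receiver, kind), kind is "call" or "text".
records = [
    ("A", "B", "text"),
    ("A", "B", "call"),
    ("B", "A", "text"),
    ("C", "A", "text"),
    ("A", "C", "call"),
    ("B", "C", "text"),
]

def user_features(records):
    """Derive simple per-user features from call-record rows."""
    texts_out = Counter()
    texts_in = Counter()
    contacts = defaultdict(Counter)  # user -> interaction count per contact
    for src, dst, kind in records:
        # Count the interaction for both parties, regardless of direction.
        contacts[src][dst] += 1
        contacts[dst][src] += 1
        if kind == "text":
            texts_out[src] += 1
            texts_in[dst] += 1
    features = {}
    for user, counts in contacts.items():
        total = sum(counts.values())
        features[user] = {
            "texts_in": texts_in[user],
            "texts_out": texts_out[user],
            "distinct_contacts": len(counts),
            # Share of all interactions that involve the most frequent contact.
            "top_contact_share": max(counts.values()) / total,
        }
    return features

features = user_features(records)
```

A high `top_contact_share` combined with a low `distinct_contacts` value is one simple way to express communication that is concentrated on few people.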
As a last step, MIT Technology Review reports, Sundsøy “used 75 percent of the data to search for patterns associated with users who are illiterate, using a variety of number-crunching and machine learning techniques. He used the remaining 25 percent to test whether it is possible to use these patterns to identify illiterate people and areas where there is a higher proportion of illiterate people.”
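The 75/25 procedure described above is a standard train/test split: patterns are learned on one portion of the data and checked on a portion the model has never seen. The sketch below illustrates it with synthetic rows and a deliberately simple stand-in model (a single cutoff on one feature). The helper names, the data, and the threshold rule are assumptions for illustration only, not the techniques used in the study.

```python
import random

def split_rows(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of rows and split it into (train, test)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def accuracy(predictions, labels):
    """Fraction of predictions that match the true labels."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)

def fit_threshold(train):
    """Stand-in model: pick the cutoff that best separates the training rows."""
    labels = [y for _, y in train]
    candidates = sorted({x for x, _ in train})
    return max(
        candidates,
        key=lambda t: accuracy([x <= t for x, _ in train], labels),
    )

# Synthetic rows: (distinct_contacts, is_illiterate). Made up for illustration.
rows = [(n, n <= 3) for n in range(1, 21)]

train, test = split_rows(rows)
cutoff = fit_threshold(train)

# Accuracy is measured only on the held-out 25 percent.
held_out_accuracy = accuracy(
    [x <= cutoff for x, _ in test],
    [y for _, y in test],
)
```

Measuring accuracy on the held-out rows, rather than on the rows used to find the cutoff, is what makes the reported figure a fair estimate of how the patterns generalize to new users.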
Sundsøy’s results suggested that a number of factors predict illiteracy, the strongest of which was location. “One explanation can be that the model catches regions of low economic development status, e.g., slum areas where illiteracy is high,” Sundsøy said. The ratio of incoming to outgoing texts was also an indicator. “Illiterates tend to concentrate their communication on few people,” Sundsøy noted.
Most notably, the scientist’s machine learning algorithm could identify which individuals in a population were illiterate with reasonable accuracy. “By deriving economic, social, and mobility features for each mobile user we predict individual illiteracy status with 70 percent accuracy,” he noted.
That said, Sundsøy’s study was confined to a single data set from a single country, and the approach may need further, more robust testing before aid agencies adopt it widely. All the same, it’s an interesting look at how phone records might reveal more about a population than previously believed.