Yahoo just released a ton of user data in the name of academia

Yahoo just released a ton of data in the name of academia. In what is purported to be the largest cache of Internet data ever granted to researchers, the company is giving universities access to the online behaviors of some 20 million anonymous users, including their clicks, hovers, and scrolls across a myriad of Yahoo’s pages. The sheer volume of information, Yahoo says, should allow scientists to further their work on machine learning and deep learning.

“Our goals are to promote independent research in the fields of large-scale machine learning and recommender systems, and to help level the playing field between industrial and academic research,” the Internet giant said in a blog post about the recent release. “The dataset is available as part of the Yahoo Labs Webscope data-sharing program, which is a reference library of scientifically useful datasets comprising anonymized user data for noncommercial use.” 

The decision comes as Yahoo, now two decades old, faces an alarmingly static period, even as chief competitors like Google and various social media companies make huge strides across different fields within the tech industry. So in an effort to innovate, Yahoo is investing deeply in artificial intelligence and allowing researchers to see exactly how people behave when they’re on the Internet.

Although all the data is completely anonymized, users might be alarmed by how much Yahoo is actually telling these institutions (and only these institutions). “In addition to the interaction data,” Yahoo says, “we are providing categorized demographic information (age range, gender, and generalized geographic data) for a subset of the anonymized users. On the item side, we are releasing the title, summary, and key-phrases of the pertinent news article.” Further, the company will also reveal “the relevant local time” along with “partial information about the device on which the user accessed the news feeds, which allows for interesting work in contextual recommendation and temporal data mining.”

This comes as a huge boon to researchers, who often don’t have enough data to fully realize their projects. “Data is not easy to come by for folks not inside companies,” said Gert Lanckriet, a professor in the Department of Electrical and Computer Engineering at the University of California, San Diego, at an event announcing the data release.

“We hope that this data release will similarly inspire our fellow researchers, data scientists, and machine learning enthusiasts in academia, and help validate their models on an extensive, ‘real-world’ dataset,” Yahoo concluded. “We strongly believe that this dataset can become the benchmark for large-scale machine learning and recommender systems, and we look forward to hearing from the community about their applications of our data.”

Lulu Chang