
Yahoo is open sourcing its deep learning model to identify pornography

Looking to keep content that is not safe for work off your work screen? Artificial intelligence may be able to help you do that. On Friday, Yahoo research engineer Jay Mahadeokar and senior director of product management Gerry Pesavento published a blog post announcing the release of the company’s “deep learning model that will allow developers to experiment with a classifier for NSFW detection, and provide feedback to us on ways to improve the classifier.” In essence, Yahoo is open sourcing its algorithms for detecting pornographic images.

“Automatically identifying that an image is not suitable/safe for work (NSFW), including offensive and adult images, is an important problem which researchers have been trying to tackle for decades,” the Yahoo team wrote on Friday. “With the evolution of computer vision, improved training data, and deep learning algorithms, computers are now able to automatically classify NSFW image content with greater precision.”


That said, an open source model or algorithm for identifying NSFW images doesn’t currently exist, Yahoo pointed out. As such, “in the spirit of collaboration and with the hope of advancing this endeavor,” the company is filling the gap, providing a machine learning tool that focuses solely on identifying pornographic images. The reason for this specificity, Yahoo explained, is that “what may be objectionable in one context can be suitable in another.” But for these purposes, porn is decidedly unsuitable.


Yahoo’s system uses deep learning to assign an image a score between 0 and 1 to determine just how NSFW it really is. “Developers can use this score to filter images below a certain suitable threshold based on a ROC curve for specific use-cases, or use this signal to rank images in search results,” Yahoo said. But bear in mind that there’s no guarantee of accuracy here — really, that’s where you come in. “This model is a general purpose reference model, which can be used for the preliminary filtering of pornographic images,” the blog post concluded. “We do not provide guarantees of accuracy of output, rather, we make this available for developers to explore and enhance as an open source project.”
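As a rough illustration of how a developer might consume such a score, here is a minimal Python sketch built around the Caffe network Yahoo released (the open_nsfw project). The file names, the preprocessing mean values, the "prob" output layer, and the 0.8 threshold below are assumptions to be checked against the actual release, not guarantees from the blog post:

```python
import numpy as np
import caffe  # pycaffe; the released classifier is distributed as a Caffe network

# Paths follow the layout of Yahoo's open_nsfw repository; adjust as needed.
PROTOTXT = "nsfw_model/deploy.prototxt"
WEIGHTS = "nsfw_model/resnet_50_1by2_nsfw.caffemodel"

net = caffe.Net(PROTOTXT, WEIGHTS, caffe.TEST)

# Standard Caffe preprocessing: HWC -> CHW, BGR channel order, mean subtraction.
transformer = caffe.io.Transformer({"data": net.blobs["data"].data.shape})
transformer.set_transpose("data", (2, 0, 1))
transformer.set_mean("data", np.array([104.0, 117.0, 123.0]))  # assumed BGR means
transformer.set_raw_scale("data", 255)
transformer.set_channel_swap("data", (2, 1, 0))

def nsfw_score(image_path):
    """Return the model's NSFW probability in [0, 1] for a single image."""
    img = caffe.io.load_image(image_path)  # RGB float image in [0, 1]
    net.blobs["data"].data[...] = transformer.preprocess("data", img)
    out = net.forward()
    # Assuming a two-class softmax output named "prob": index 0 = SFW, index 1 = NSFW.
    return float(out["prob"][0][1])

# The two use cases the post mentions: filter below a threshold, or rank by score.
THRESHOLD = 0.8  # illustrative; tune per use case against an ROC curve

def filter_sfw(paths, threshold=THRESHOLD):
    """Keep only images scored below the chosen NSFW threshold."""
    return [p for p in paths if nsfw_score(p) < threshold]

def rank_by_score(paths):
    """Order images from most to least likely NSFW, e.g. for search ranking."""
    return sorted(paths, key=nsfw_score, reverse=True)
```

As the post stresses, the score is only a preliminary signal, so the threshold is a per-application choice rather than a fixed constant.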

Lulu Chang
Former Digital Trends Contributor
Turns out, it’s not that hard to do what OpenAI does for less

Even as OpenAI continues clinging to its assertion that the only path to AGI lies through massive financial and energy expenditures, independent researchers are leveraging open-source technologies to match the performance of its most powerful models -- and do so at a fraction of the price.

Last Friday, a joint team from Stanford University and the University of Washington announced that it had trained a math- and coding-focused large language model that performs as well as OpenAI's o1 and DeepSeek's R1 reasoning models. It cost just $50 in cloud compute credits to build. The team reportedly took an off-the-shelf base model and distilled Google's Gemini 2.0 Flash Thinking Experimental model into it. Distillation involves extracting the knowledge a larger model uses to complete a specific task and transferring it to a smaller one.
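For readers unfamiliar with the technique, here is a minimal PyTorch sketch of the classic soft-label distillation loss. The temperature, the loss weighting, and the commented-out teacher/student calls are illustrative assumptions, not details from the Stanford and University of Washington work:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Blend a soft-label KL term (student mimics the teacher's distribution)
    with the usual hard-label cross-entropy. temperature and alpha are
    illustrative hyperparameters, not values from the paper."""
    # Soften both distributions; the teacher's probabilities act as targets.
    soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
    log_probs = F.log_softmax(student_logits / temperature, dim=-1)
    kd = F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2
    ce = F.cross_entropy(student_logits, labels)
    return alpha * kd + (1 - alpha) * ce

# During training, a frozen teacher scores each batch and the smaller student
# is updated to match it (teacher/student here are hypothetical models):
# with torch.no_grad():
#     teacher_logits = teacher(batch)
# loss = distillation_loss(student(batch), teacher_logits, batch_labels)
```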

New MediaTek Chromebook benchmark surfaces with impressive speed

Many SoCs are being prepared for upcoming 2025 devices, and a recent benchmark suggests that a MediaTek chipset could make this year's Chromebooks the fastest yet.

ChromeUnboxed spotted the latest Geekbench scores for the MediaTek MT8196 chip, which has been the subject of reports for some time. Running on a motherboard codenamed ‘Navi,’ the chip excels in both single-core and multi-core tests, as well as in the GPU and NPU benchmarks.

Chrome incognito just got even more private with this change

Google Chrome's Incognito mode and Microsoft Edge's InPrivate mode just became even more private: according to Windows Latest, they no longer save copied text and media to the Windows clipboard history. The change applies to Windows 11 and 10 users and was rolled out in 2024, though neither Microsoft nor Google documented it.

Even though the change is not recent, it's odd that neither tech giant thought it worth mentioning. Previously, text or images copied to the clipboard history were synced with Cloud Clipboard on Windows by default, and that synced content could be pulled up simply by pressing Windows + V, which posed a security risk, especially in a private browsing session.
