
Meta is building a space-age ‘universal language translator’

When you think of tools infused with artificial intelligence (AI) these days, it’s natural for ChatGPT and Bing Chat to spring to mind. But Facebook owner Meta wants to change that with SeamlessM4T, an AI-powered “universal language translator” that could instantly convert any language in the world into whatever output you want.

Meta describes SeamlessM4T as “the first all-in-one multilingual multimodal AI translation and transcription model.” That’s quite a mouthful, but in simple terms, it means it can convert languages in a range of different ways, such as taking speech audio and switching it into text in a different tongue.

A silhouetted person holds a smartphone displaying the Facebook logo. They are standing in front of a sign showing the Meta logo.
SOPA Images / Getty Images

According to Meta, the tool’s speech recognition and translation features can work in a few different ways:

• Speech recognition for nearly 100 languages
• Speech-to-text translation for nearly 100 input and output languages
• Speech-to-speech translation for nearly 100 input languages and 36 output languages (including English)
• Text-to-text translation for nearly 100 languages
• Text-to-speech translation for nearly 100 input languages and 35 output languages (including English)

Meta says this will “allow people to communicate effortlessly through speech and text across different languages.”

Coming soon to Facebook?

Meta's AI translation tool SeamlessM4T converts Spanish-language input into Vietnamese, both in text and audio forms.
Meta

SeamlessM4T is being released under a research license, and Meta states it’s doing this to “allow researchers and developers to build on this work.” Meta is also publicly releasing the metadata of SeamlessAlign, the dataset used to train the translation model. This consists of “270,000 hours of mined speech and text alignments,” Meta claims.

However, Meta did not make it clear where these 270,000 hours of “mined speech” were sourced from. Concerns have been raised over the privacy implications of Meta’s work on AI chatbots, while other AI tools have already been caught stealing protected work. There will no doubt be fears that Meta could have done something similar when it trained SeamlessM4T.

There are already other translation tools like Google Translate that can convert text to text and speech to text, but Meta says its own efforts are superior. SeamlessM4T, Meta argues, “reduces errors and delays, increasing the efficiency and quality of the translation process.”

Meta has not said whether the new tool will be integrated into its apps like Facebook and Instagram, but the company did reveal that it aimed to “explore how this foundational model can enable new communication capabilities” in the future. We’ll have to see what that entails.

Alex Blake
Alex Blake has been working with Digital Trends since 2019, where he spends most of his time writing about Mac computers…
I saw Google’s Gemini AI erase copyright evidence. I am deeply worried
Gemini Advanced on the Google Pixel 9 Pro Fold.

Update: Google has responded to Digital Trends’ queries. The story has been updated with the company’s statement below.
The rise of generative AI has been a fairly messy process, especially from a fair-use ethics and copyright perspective. AI giants are inking deals with publishers to avoid legal hassles, while at the same time, they are embroiled in copyright tussles in courts in multiple countries.
As the ravenous appetite for training AI on user data grows, we might be in for another ethical conundrum. Multiple users on X and Reddit have shared demonstrations of how Google’s latest Gemini 2.0 series AI model can remove watermarks from copyright-protected images.
Going by the before/after samples of images, it seems Gemini is doing a fairly good job of removing the watermarks. Notably, it’s not only erasing those banner-style watermarks, but also fairly complex overlays with design and stylized text elements.
https://x.com/ajiteshleo/status/1901484196972945872?s=61
The model in question is Gemini 2.0 Flash, which was released earlier this year and got a reasoning upgrade as well. It is worth noting here that you cannot remove the watermark if you are using the mobile or desktop version of the Gemini 2.0 Flash model. Trying to do so returns a message like this:
“Removing watermarks from images without the explicit permission of the copyright holder is illegal in most countries. It is important to respect the copyright laws and intellectual property rights. If you want to use an image with a watermark, you should contact the copyright holder and ask for permission.”
You can, however, try and remove the watermark from images in the Google AI Studio. Digital Trends successfully removed watermarks from a variety of images using the Gemini 2.0 Flash (Image Generation) Experimental model.
 
Removing watermarks this way is a violation of local copyright laws, and any usage of AI-modified material without due consent could land you in legal trouble. Moreover, it is a deeply unethical act, which is also why artists and authors are fighting in court over companies using their work to train AI models without duly compensating them or seeking their explicit nod.

How are the results?
A notable aspect is that the images produced by the AI are fairly high quality. Not only does it remove the watermark artifacts, it also fills the gap with intelligent pixel-level reconstruction. In its current iteration, it works somewhat like the Magic Eraser feature available in the Google Photos app for smartphones.
Furthermore, if the input image is low quality, Gemini not only wipes off the watermark details but also upscales the overall picture.
https://x.com/kaiju_ya/status/1901099096930496720?s=61
The output image, however, has its own Gemini watermark, although this itself can be removed with a simple crop. There are a few minor differences in the final image produced by Gemini after its watermark removal process, such as slightly different color temperatures and fuzzy surface details in photorealistic shots.

Google is giving free access to two of Gemini’s best AI features
Gemini Advanced on the Google Pixel 9 Pro Fold.

Google’s Gemini AI has steadily made its way to the best of its software suite, from native Android integrations to interoperability with Workspace apps such as Gmail and Docs. However, some of the most advanced Gemini features have remained locked behind a subscription paywall.
That changes today. Google has announced that Gemini Deep Research will now be available for all users to try, alongside the ability to create custom Gem bots. You no longer need a Gemini Advanced (or Google One AI Premium) subscription to use the aforementioned tools.

The best of Gemini as an AI agent
Deep Research is an agentic tool that takes over the task of web research, saving users the hassle of visiting one web page after another, looking for relevant information. With Deep Research, you can simply put a natural language query as input, and also specify the source, if needed.

Google’s new Gemma 3 AI models are fast, frugal, and ready for phones
Google Gemma 3 open-source AI model on a tablet.

Google’s AI efforts are synonymous with Gemini, which has now become an integral element of its most popular products across its Workspace software, and its hardware as well. However, the company has also been releasing open-source AI models under the Gemma label for over a year now.

Today, Google revealed its third-generation open-source AI models with some impressive claims in tow. The Gemma 3 models come in four variants (1 billion, 4 billion, 12 billion, and 27 billion parameters) and are designed to run on devices ranging from smartphones to beefy workstations.
Ready for mobile devices
