
Google strikes back with an answer to OpenAI’s Sora launch

Veo 2 on VideoFX
Google DeepMind

Google’s DeepMind division on Monday unveiled Veo 2, its second-generation video generation model, which can create clips up to two minutes long at resolutions up to 4K. That’s six times the length and four times the pixel count of the 20-second, 1080p clips Sora can generate.

Of course, those are Veo 2’s theoretical upper limits. The model is currently only available on VideoFX, Google’s experimental video generation platform, and its clips are capped at eight seconds and 720p resolution. VideoFX is also waitlisted, so not just anyone can log on to try Veo 2, though the company announced that it will be expanding access in the coming weeks. A Google spokesperson also noted that Veo 2 will be made available on the Vertex AI platform once the company can sufficiently scale the model’s capabilities.


“Over the coming months, we’ll continue to iterate based on feedback from users,” Eli Collins, DeepMind’s VP of product, told TechCrunch, “and [we’ll] look to integrate Veo 2’s updated capabilities into compelling use cases across the Google ecosystem … We expect to share more updates next year.”

Today, we’re announcing Veo 2: our state-of-the-art video generation model which produces realistic, high-quality clips from text or image prompts. 🎥

We’re also releasing an improved version of our text-to-image model, Imagen 3 – available to use in ImageFX through… pic.twitter.com/h6ejHaMUM4

— Google DeepMind (@GoogleDeepMind) December 16, 2024

Veo 2 reportedly holds a number of advantages over its predecessors, including a better understanding of physics (think better fluid dynamics and better illumination/shadowing effects) as well as the capacity to generate “clearer” video clips, in that generated textures and images are sharper and less prone to blurring when moving. The new model also offers improved camera controls, enabling the user to position the virtual camera lens with greater precision than before.

As TechCrunch notes, Veo 2 has not yet perfected the video generation process, though it does appear to hallucinate far less than rivals like Sora, Kling, Movie Gen, or Gen 3 Alpha. “Coherence and consistency are areas for growth,” Collins said. “Veo can consistently adhere to a prompt for a couple minutes, but [it can’t] adhere to complex prompts over long horizons. Similarly, character consistency can be a challenge. There’s also room to improve in generating intricate details, fast and complex motions, and continuing to push the boundaries of realism.”

Google also announced improvements to Imagen 3 on Monday, enabling the commercial image generation model to create “brighter, better-composed” outputs. The model, available on ImageFX, will also offer additional descriptive suggestions based on keywords in the user’s prompt, with each keyword spawning a drop-down menu of related terms.

Andrew Tarantola