Forget text-to-image; this AI makes videos from your prompts

You’ve likely heard about the amazing results realized by text-to-image AI such as Dall-E, Stable Diffusion, and Midjourney. As you might have expected, the revolution is marching onward, with the next target being text-to-video AI tools.

QuickVid generated this video about a DJI Drone and astronauts on Mars.

Google and Meta have teased their text-to-video capabilities in research reports from their AI labs, but this advanced technology hasn’t been available to the public. If you’ve been eager to create entire videos from a simple AI prompt, now’s your chance, thanks to QuickVid.

Before your expectations climb too high, it’s important to realize that QuickVid doesn’t generate thousands of Stable Diffusion stills and assemble them into a video, nor does it give you access to the most advanced AI systems in the world for true video generation. This is a very early entry into the race for a text-to-video solution.

The first step of the process for the AI is to generate a script based on your prompt. I tested the system by creating a YouTube Short from these words: “A video of a DJI drone flying over an astronaut on Mars, ending with a reaction shot of the surprised astronaut.”

The AI wrote a complete, 79-word narrative from my prompt, then synthesized the speech with a choice of a male or female voice. TechCrunch pointed out that the background video comes from a stock library, which apparently had plenty of footage of “astronauts on Mars.”

As a questionable finishing touch, QuickVid overlays the script as titles and adds thumbnail images generated by the Dall-E API. The resulting YouTube Short seen above is … interesting. Perhaps it would handle more earthly subjects better.
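For readers who think in code, the stages described above can be sketched roughly as follows. This is only an illustration of the flow (script, voice, stock footage, titles, thumbnail), not QuickVid's actual implementation. Every function name, the keyword-matching rule for stock clips, and the five-word title chunks are assumptions made for the example, and each stage is a stub standing in for a real model or API call.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ShortVideo:
    """Hypothetical container for the pieces of one generated short."""
    prompt: str
    script: str = ""
    narration: str = ""
    background_clip: str = ""
    titles: List[str] = field(default_factory=list)
    thumbnail: str = ""


def write_script(prompt: str) -> str:
    # Stub: a real tool would call a text-generation model here.
    return f"Narration about: {prompt}"


def synthesize_speech(script: str, voice: str) -> str:
    # Stub: a real tool would call a text-to-speech service here.
    if voice not in ("male", "female"):
        raise ValueError("voice must be 'male' or 'female'")
    return f"<{voice} voice, {len(script.split())} words>"


def pick_stock_clip(prompt: str, library: Dict[str, List[str]]) -> str:
    # Assumed rule: choose the clip whose tags overlap the prompt the most.
    words = set(prompt.lower().split())
    return max(library, key=lambda name: len(words & set(library[name])))


def make_titles(script: str, words_per_title: int = 5) -> List[str]:
    # Split the script into short chunks to overlay as on-screen titles.
    tokens = script.split()
    return [
        " ".join(tokens[i:i + words_per_title])
        for i in range(0, len(tokens), words_per_title)
    ]


def make_thumbnail(prompt: str) -> str:
    # Stub: a real tool would call an image-generation API here.
    return f"thumbnail for: {prompt}"


def build_short(prompt: str, voice: str, library: Dict[str, List[str]]) -> ShortVideo:
    """Run the stages in the order the article describes."""
    video = ShortVideo(prompt=prompt)
    video.script = write_script(prompt)
    video.narration = synthesize_speech(video.script, voice)
    video.background_clip = pick_stock_clip(prompt, library)
    video.titles = make_titles(video.script)
    video.thumbnail = make_thumbnail(prompt)
    return video


if __name__ == "__main__":
    # Made-up stock library: file names mapped to descriptive tags.
    stock = {
        "mars_astronaut.mp4": ["astronaut", "mars"],
        "city_traffic.mp4": ["city", "cars"],
    }
    short = build_short("A drone flying over an astronaut on Mars", "female", stock)
    print(short.background_clip)
    print(short.titles)
```

The sketch makes one point clear: only the script, voice, and thumbnail come from generative models, while the video itself is existing stock footage chosen to match the prompt.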

In a TechCrunch interview, the developer of QuickVid said improvements are coming, with more personalization options arriving in January. Eventually, QuickVid will also include captions and support avatars.

Next year could see many more text-to-video solutions arrive, along with other visual wonders such as AR glasses and more advanced VR headsets. It should be exciting.

Alan Truly
Computing Writer
Alan is a Computing Writer living in Nova Scotia, Canada. A tech-enthusiast since his youth, Alan stays current on what is…