
Amazon unveils its new family of Nova foundational models


Amazon CEO Andy Jassy took to the stage at the company’s re:Invent conference on Tuesday to show off six new text, image, and video generation models that it calls Amazon Nova.

This new family of multimodal generative AIs includes Nova Micro, a text-only model built for low-cost, low-latency responses; Nova Lite, a low-cost multimodal model for processing image, video, and text inputs; and Nova Pro, its general purpose multimodal model that combines “accuracy, speed, and cost for a wide range of tasks,” per the company’s announcement post. Nova Premier is Amazon’s “most capable … multimodal models for complex reasoning tasks,” while Nova Canvas is a dedicated text-to-image engine and Nova Reel is purpose-built to generate video.


The text-based models have been optimized for 15 different languages. Micro offers a 128,000-token context window, while both Lite and Pro can handle up to 300,000 tokens (around 225,000 words or 30 minutes of video). The company plans to expand the context windows of its larger models to as many as 2 million tokens by early next year.
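For a sense of scale, the figures above imply roughly 0.75 words per token (225,000 words for 300,000 tokens). The short Python sketch below applies that ratio to the context windows quoted in this article. The ratio is only a rule of thumb that varies by tokenizer and language, and the function names are illustrative, not part of any Amazon API.

```python
# Rule of thumb implied by the article: 300,000 tokens ~ 225,000 words.
# This is an approximation, not a property of any specific tokenizer.
WORDS_PER_TOKEN = 225_000 / 300_000  # 0.75

# Context windows as quoted in the article (in tokens).
CONTEXT_WINDOWS = {
    "Nova Micro": 128_000,
    "Nova Lite": 300_000,
    "Nova Pro": 300_000,
}


def approx_words(tokens: int) -> int:
    """Estimate how many words fit in a given token budget."""
    return round(tokens * WORDS_PER_TOKEN)


def fits(model: str, word_count: int) -> bool:
    """Return True if a text of word_count words should fit in the model's window."""
    return word_count <= approx_words(CONTEXT_WINDOWS[model])


if __name__ == "__main__":
    for name, tokens in CONTEXT_WINDOWS.items():
        print(f"{name}: {tokens:,} tokens ~ {approx_words(tokens):,} words")
```

By the same estimate, the planned 2 million-token window would hold around 1.5 million words.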

Canvas enables users to generate and edit images using natural language prompts. Reel, which will compete with the likes of Gen-3 Alpha, Kling, and Dall-E 3, can generate clips up to six seconds in length from both text prompts and reference images. The video generator also offers camera motion control, including pans and zooms.

Pasta City, created with Amazon Nova Reel by Amazon Ads

“We’ve continued to work on our own frontier models,” Jassy told the assembled crowd, “and those frontier models have made a tremendous amount of progress over the last four to five months. And we figured, if we were finding value out of them, you would probably find value out of them.”

Jassy also says that these models are both among the least expensive to operate and fastest in their class, though the company has yet to post benchmark data supporting those claims. “We’ve optimized these models to work with proprietary systems and APIs, so that you can do multiple orchestrated automatic steps — agent behavior — much more easily with these models,” he said. “So I think these are very compelling.”

The Micro, Lite, and Pro models (as well as Canvas and Reel) are all currently available to AWS customers. Premier is set to arrive in Q1 2025.

Andrew Tarantola
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…
Yes, it’s real: ChatGPT has its own 800 number

On the 10th day of its "12 Days of OpenAI" media event, the company announced that it has set up an 800 number (1-800-ChatGPT, of course) where anyone in the U.S. with a phone line can dial in and speak with the AI via Advanced Voice Mode. Because why not.

“[The goal of] OpenAI is to make artificial general intelligence beneficial to all of humanity, and part of that is making it as accessible as possible to as many people as we can,” the company's chief product officer, Kevin Weil, said during the Wednesday live stream. “Today, we’re taking the next step and bringing ChatGPT to your telephone.”

OpenAI opens up developer access to the full o1 reasoning model

On the ninth day of OpenAI's holiday press blitz, the company announced that it is releasing the full version of its o1 reasoning model to select developers through the company's API. Until Tuesday's news, devs could only access the less-capable o1-preview model.

According to the company, the full o1 model will begin rolling out to developers in OpenAI's "Tier 5" category. Those are users who have had an account for more than a month and who have spent at least $1,000 with the company. The new service is especially pricey (on account of the added compute resources o1 requires), costing $15 for every (roughly) 750,000 words analyzed and $60 for every (roughly) 750,000 words generated by the model. That's three to four times the cost of performing the same tasks with GPT-4o.
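To put those rates in concrete terms, here is a rough Python sketch that estimates a bill from word counts, using only the prices quoted above. The 750,000-word unit is itself an approximation, and the function name is hypothetical, not something from OpenAI's API.

```python
# Prices as quoted in the article; the word counts are rough equivalents.
WORDS_PER_PRICING_UNIT = 750_000
INPUT_USD_PER_UNIT = 15.0   # per ~750,000 words analyzed
OUTPUT_USD_PER_UNIT = 60.0  # per ~750,000 words generated


def estimate_o1_cost(words_in: int, words_out: int) -> float:
    """Estimate the cost in USD for a given number of input and output words."""
    input_cost = words_in / WORDS_PER_PRICING_UNIT * INPUT_USD_PER_UNIT
    output_cost = words_out / WORDS_PER_PRICING_UNIT * OUTPUT_USD_PER_UNIT
    return round(input_cost + output_cost, 4)


if __name__ == "__main__":
    # Example: a 7,500-word document analyzed, a 1,500-word response generated.
    print(f"${estimate_o1_cost(7_500, 1_500):.2f}")
```

At these rates, generated words cost four times as much as analyzed words, so long responses dominate the bill.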

I tried out Google’s latest AI tool that generates images in a fun, new way

Google’s latest AI tool helps you automate image generation even further. The tool is called Whisk, and it's based on Google’s latest Imagen 3 image generation model. Rather than relying solely on text prompts, Whisk helps you create your desired images using other images as the base prompt.

Whisk is currently in an experimental phase, but once set up it's fairly easy to navigate. Google detailed in a blog post introducing Whisk that it is intended for “rapid visual exploration, not pixel-perfect edits.”
