I tried out Google’s latest AI tool that generates images in a fun, new way

Google's Whisk AI tool being used with images.
Google

Google’s latest AI tool helps you automate image generation even further. The tool is called Whisk, and it’s based on Google’s latest Imagen 3 image generation model. Rather than relying solely on text prompts, Whisk helps you create your desired images using other images as the base prompt.

Whisk is currently in an experimental phase, but once set up it’s fairly easy to navigate. Google detailed in a blog post introducing Whisk that it is intended for “rapid visual exploration, not pixel-perfect edits.”

Exploring the tool feels fast-paced compared with text-based tools, which depend more heavily on the detail and accuracy of the wording to produce an image.

After going through the Welcome page, which lists the important details about how the tool functions, a page asking whether you’d like to sign up for emails, and the privacy policy, you’ll load right into the main page of Whisk. I was shown a prompt with a dinosaur plushie as the image style; the other options are an enamel pin and a sticker. I went with the first.

Next, you’re directed to upload an image for the subject. I uploaded a photograph of a smartwatch on my wrist and quickly realized this wasn’t going to work: the third option on the right was stuck in a perpetual loading mode. I tried again with a more cartoonish image I found on my hard drive, and this loaded right away into plushie figurines of three mythical creatures.

Google Whisk being used with images uploaded.
Google

Once the image was generated, I was able to go into an editing section, with a text prompt area. Simply using the suggested prompt “the character is eating ice cream,” I generated additional images with the same creatures holding ice cream cones.

Alternatively, you can scroll down below the main prompt section and select start from scratch. This allows you to upload all of your own images or enter your own text. You can also add text from the beginning so that your characters perform an action. If you’re at a loss for which images to add or what text to type, you can click the Inspire Me button, and Whisk will fill in images.

The Google Whisk AI tool being used with images.
Google

The tool also allows you to access a My Library section, where you can view all of the images you’ve created. In this section, you can enable or disable the library if you’d prefer to not save your creations. You can also download images, delete images individually, or delete library data as a whole. Additionally, you can select the prompt input option on each image to see the entire text prompt for the generated image. There is a copy option available for sharing to other tools and programs.

I later discovered Whisk did generate an image blending the plushie and smartwatch images and saved it in My Library. So, my recommendation is, if you have mishaps with the tool, check in your library to see if any images have developed in the background.

The Whisk tool is reminiscent of the Microsoft Designer prompt that allows users to create Funko Pop! figures. As a whole, you can use Microsoft Designer to generate a range of whimsical or realistic images. However, the AI generator, which currently uses the DALL-E 3 image generation model developed by OpenAI, runs solely on text prompts.

To experiment, I took the text prompt for the plushie smartwatch to Microsoft Designer. Let’s just say the results were less detailed and a little bit haunting, with human faces on a watch body instead of a detailed watch face. This suggests that the Imagen 3 model in Whisk can more closely decipher context when analyzing images than the DALL-E 3 model can when processing text.

As mentioned, Whisk still lets you add text prompts. Google noted that it included the option because the tool has the potential to “miss the mark,” so you can always fill in prompts when needed.

Fionna Agomuoh
Fionna Agomuoh is a Computing Writer at Digital Trends. She covers a range of topics in the computing space, including…