Microsoft’s new bot can draw a photo-realistic bird based on text descriptions

Microsoft’s research labs have created a new artificial intelligence, or bot, that the company says can draw anything in pixel form from the caption-like text descriptions you provide. Text-to-image creation isn’t new, but Microsoft’s “drawing bot” focuses on captions as image descriptors, and the company claims the resulting image quality is three times better than that of other state-of-the-art technologies.

“The technology, which the researchers simply call the drawing bot, can generate images of everything from ordinary pastoral scenes, such as grazing livestock, to the absurd, such as a floating double-decker bus,” Microsoft states. “Each image contains details that are absent from the text descriptions, indicating that this artificial intelligence contains an artificial imagination.” 

Microsoft’s drawing bot merges two components of artificial intelligence: Natural-language processing and computer vision. The research project started with a bot that could generate text captions from photos. The researchers then advanced the project to answer human-generated questions about images, such as identifying a location, the object in focus, and so on. 

But actually drawing an image is a huge step. While the bot can generate components based on text descriptors, it must “imagine” all the other missing pieces of the picture. Thus, if you tell the bot to draw a yellow bird with black wings, it has four descriptors to work with, but it must pull the remaining parts from data it acquired from previous drawings, photos, and more. In other words, it relies on knowledge obtained through machine learning.

Microsoft’s bot relies on a generative adversarial network (GAN). Imagine two teams of computers: The first, known as the generator, must render an image that fools the second, known as the discriminator, into believing it’s an actual photograph. The teams go back and forth, with the first saying the image is real and the second saying “nuh-uh,” disproving the claim. The goal is for the first team to render an image that finally fools the second.
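
To make that back-and-forth concrete, here is a minimal Python sketch of an adversarial loop, reduced to single numbers instead of images. It is not Microsoft’s code: the `Generator` and `Discriminator` classes and the step functions are hypothetical names invented for illustration, the “real photographs” are simply numbers drawn around 4.0, and text descriptions are left out entirely. One team (the generator) produces samples, the other (the discriminator) scores how real each sample looks, and every update nudges one side to beat the other.

```python
import math
import random


def sigmoid(u):
    # Numerically stable logistic function, returns a value between 0 and 1.
    if u >= 0:
        return 1.0 / (1.0 + math.exp(-u))
    e = math.exp(u)
    return e / (1.0 + e)


class Discriminator:
    """Second team: scores a sample, near 1 = 'real', near 0 = 'generated'."""

    def __init__(self, w=0.1, b=0.0):
        self.w, self.b = w, b

    def score(self, x):
        return sigmoid(self.w * x + self.b)


class Generator:
    """First team: turns random noise z into a sample (a shift and a scale)."""

    def __init__(self, scale=1.0, shift=0.0):
        self.scale, self.shift = scale, shift

    def sample(self, z):
        return self.scale * z + self.shift


def discriminator_step(disc, real, fake, lr=0.1):
    # Gradient ascent on log D(real) + log(1 - D(fake)).
    d_real, d_fake = disc.score(real), disc.score(fake)
    disc.w += lr * ((1.0 - d_real) * real - d_fake * fake)
    disc.b += lr * ((1.0 - d_real) - d_fake)


def generator_step(gen, disc, z, lr=0.1):
    # Gradient ascent on log D(G(z)): move the sample toward a higher score.
    fake = gen.sample(z)
    grad_fake = (1.0 - disc.score(fake)) * disc.w
    gen.scale += lr * grad_fake * z
    gen.shift += lr * grad_fake


def train(steps=1000, lr=0.01, real_mean=4.0, seed=0):
    # Alternate the two updates; the toy "real" data is an assumption.
    rng = random.Random(seed)
    gen, disc = Generator(), Discriminator()
    for _ in range(steps):
        real = rng.gauss(real_mean, 1.0)
        z = rng.gauss(0.0, 1.0)
        discriminator_step(disc, real, gen.sample(z), lr)
        generator_step(gen, disc, z, lr)
    return gen, disc
```

Calling `train()` returns the two trained objects. In a toy setup like this the two sides can keep chasing each other rather than settling, which is one reason real GANs are considered tricky to train.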

In this case, the first team renders an image derived from text-based descriptions, and the second team rejects its “authenticity” as an actual photograph until the first team renders the image convincingly. Microsoft first fed its GAN paired images and captions so that it could learn, for example, to draw a bird from the single word “bird.”

From there, Microsoft continued to build the knowledge base with paired images and captions consisting of multiple traits, such as black wings and a red belly. Microsoft says it’s not using just any GAN, however, but one that targets tiny details so the bot can produce photo-realistic results. Microsoft dubs it an attentional GAN, or AttnGAN.

“As humans draw, we repeatedly refer to the text and pay close attention to the words that describe the region of the image we are drawing,” the company says. “[AttnGAN] does this by breaking up the input text into individual words and matching those words to specific regions of the image.” 
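
The word-to-region matching described in that quote can be illustrated with a toy attention calculation. This is a sketch under stated assumptions rather than AttnGAN itself: the word vectors and region features below are hand-made numbers (a real system would learn them), and `attend` and `best_word_per_region` are hypothetical helper names. For each image region, the code scores every word with a dot product and turns the scores into weights with a softmax, so the words most relevant to that region receive the most attention.

```python
import math

# Toy inputs, invented for illustration only.
WORDS = ["yellow", "bird", "black", "wings"]
WORD_VECTORS = [
    [1.0, 0.0, 0.0, 0.0],  # yellow
    [0.0, 1.0, 0.0, 0.0],  # bird
    [0.0, 0.0, 1.0, 0.0],  # black
    [0.0, 0.0, 0.0, 1.0],  # wings
]
REGION_NAMES = ["body", "wing"]
REGION_FEATURES = [
    [0.9, 0.4, 0.0, 0.1],  # body region
    [0.0, 0.2, 0.8, 0.9],  # wing region
]


def softmax(scores):
    # Subtract the maximum first so the exponentials cannot overflow.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def attend(region_features, word_vectors):
    # One row per region; each row holds attention weights over the words.
    return [
        softmax([dot(region, word) for word in word_vectors])
        for region in region_features
    ]


def best_word_per_region(words, region_features, word_vectors):
    # Pick the word with the largest attention weight for each region.
    weights = attend(region_features, word_vectors)
    return [words[row.index(max(row))] for row in weights]


if __name__ == "__main__":
    best = best_word_per_region(WORDS, REGION_FEATURES, WORD_VECTORS)
    for name, word in zip(REGION_NAMES, best):
        print(f"{name}: pays most attention to '{word}'")
```

With these toy numbers, the body region attends most to “yellow” and the wing region to “wings,” which mirrors the idea of consulting different words while drawing different parts of the picture.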

You can read Microsoft’s research paper describing its AttnGAN here. 

Kevin Parrish
Former Digital Trends Contributor