Programmer trains artificial intelligence to draw faces from text descriptions

T2F training time lapse

Programmer Animesh Karnewar wanted to know how characters described in books would appear in reality, so he turned to artificial intelligence to see if it could properly render these fictional people. Called T2F, the research project uses a generative adversarial network (GAN) to encode text and synthesize facial images.

Simply put, a GAN consists of two neural networks that compete with each other to produce the best results. Network No. 1, the generator, tries to fool network No. 2, the discriminator, into believing a rendered image is a real photograph, while network No. 2 sets out to prove the alleged photo is just a rendered image. This back-and-forth process fine-tunes the rendering until network No. 2 can no longer reliably tell the difference.
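
That tug-of-war can be sketched in a few lines of Python. This is a toy, one-dimensional illustration and not T2F's code: the "images" are single numbers drawn around 4.0, network No. 1 (`generator`) only learns a shift `theta`, and network No. 2 (`discriminator`) is a one-weight logistic classifier. Every function name, the learning rate and the target value are assumptions made for the example.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Logistic function, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def discriminator(x, w, b):
    """Network No. 2: estimated probability that sample x is real."""
    return sigmoid(w * x + b)


def generator(z, theta):
    """Network No. 1: turns noise z into a fake sample by shifting it."""
    return z + theta


def discriminator_step(real, fake, w, b, lr):
    """One gradient-ascent step on mean log D(real) + mean log(1 - D(fake))."""
    grad_w = grad_b = 0.0
    for x in real:
        d = discriminator(x, w, b)
        grad_w += (1.0 - d) * x / len(real)
        grad_b += (1.0 - d) / len(real)
    for x in fake:
        d = discriminator(x, w, b)
        grad_w -= d * x / len(fake)
        grad_b -= d / len(fake)
    return w + lr * grad_w, b + lr * grad_b


def generator_step(noise, theta, w, b, lr):
    """One gradient-ascent step on mean log D(G(z)): try to fool D."""
    grad = 0.0
    for z in noise:
        d = discriminator(generator(z, theta), w, b)
        grad += (1.0 - d) * w / len(noise)
    return theta + lr * grad


def train(steps=200, batch=32, lr=0.05, seed=0):
    """Alternate the two updates; real samples are assumed to be N(4, 1)."""
    rng = random.Random(seed)
    theta, w, b = 0.0, 0.0, 0.0
    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [generator(z, theta) for z in noise]
        w, b = discriminator_step(real, fake, w, b, lr)
        theta = generator_step(noise, theta, w, b, lr)
    return theta, w, b
```

Calling `train()` returns the generator's learned shift along with the discriminator's weight and bias. A toy this small is not guaranteed to settle on a final answer; simple GANs can oscillate, which is one reason stabilizing techniques matter in practice.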

Karnewar started the project using a dataset called Face2Text provided by researchers at the University of Copenhagen, which contains natural language descriptions for 400 random images.

“The descriptions are cleaned to remove reluctant and irrelevant captions provided for the people in the images,” he writes. “Some of the descriptions not only describe the facial features, but also provide some implied information from the pictures.”

While the results of Karnewar’s T2F project aren’t exactly photorealistic, it’s a start. The video embedded above shows a time-lapse view of how the GAN was trained to render images from text, starting with solid blocks of color and ending with rough but identifiable pixelated renderings.

“I found that the generated samples at higher resolutions (32 x 32 and 64 x 64) has more background noise compared to the samples generated at lower resolutions,” Karnewar explains. “I perceive it due to the insufficient amount of data (only 400 images).”

The technique used to train the adversarial networks is called “Progressive Growing of GANs,” which improves image quality and training stability. As the video shows, the image generator starts from an extremely low resolution. New layers are gradually introduced into the model, increasing the level of detail as training progresses.
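
The idea of introducing layers gradually can be illustrated with a short Python sketch. It is a simplified picture and not Karnewar's implementation: images are plain nested lists, the previous stage's output is upsampled and blended with the new layer's output using a weight `alpha` that rises from 0 to 1, and the starting resolution of 4 and the `fade_steps` value are assumptions. All function names are hypothetical.

```python
def upsample_nearest(image):
    """Double the width and height of a 2-D grid by repeating each pixel."""
    out = []
    for row in image:
        wide = [p for p in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fade_in(old_low_res, new_high_res, alpha):
    """Blend the upsampled old output with the new layer's output.

    alpha = 0 keeps only the old, blurrier image; alpha = 1 uses only
    the newly added higher-resolution layer.
    """
    up = upsample_nearest(old_low_res)
    return [
        [(1.0 - alpha) * o + alpha * n for o, n in zip(row_o, row_n)]
        for row_o, row_n in zip(up, new_high_res)
    ]


def growth_schedule(start=4, final=64, fade_steps=4):
    """Yield (resolution, alpha) pairs for a progressive training run."""
    res = start
    yield res, 1.0  # the first stage has nothing to fade in from
    while res < final:
        res *= 2
        for step in range(1, fade_steps + 1):
            yield res, step / fade_steps
```

For example, `list(growth_schedule(4, 16, 2))` yields the stages 4, 8, 8, 16, 16 with `alpha` stepping through 1.0, 0.5, 1.0, 0.5, 1.0, so each new resolution is phased in before the next one is added.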

“The Progressive Growing of GANs is a phenomenal technique for training GANs faster and in a more stable manner,” he adds. “This can be coupled with various novel contributions from other papers.”

In one provided example, the text describes a woman in her late 20s with long brown hair swept over to one side, gentle facial features and no make-up. She’s “casual” and “relaxed.” Another description depicts a man in his 40s with an elongated face, a prominent nose, brown eyes, a receding hairline and a short mustache. Although the end results are extremely pixelated, the final renders show great progress in how A.I. can generate faces from scratch.

Karnewar says he plans to scale out the project to integrate additional datasets such as Flickr8K and COCO Captions. Eventually, T2F could be used in law enforcement to identify victims or criminals based on text descriptions, among other applications. He’s open to suggestions and contributions to the project.

To access the code and contribute, head to Karnewar’s repository on GitHub.

Kevin Parrish
Former Digital Trends Contributor