
DALL-E 3 could take AI image generation to the next level

DALL-E 2 image on OpenAI.
OpenAI

OpenAI might be preparing the next version of its DALL-E AI text-to-image generator with a series of alpha tests that have now been leaked to the public, according to The Decoder.

An anonymous leaker on Discord shared details about his experience with an upcoming OpenAI image model that is being referred to as DALL-E 3. He first appeared in May, telling an interest-based Discord channel that he was part of an OpenAI alpha test trying out a new AI image model. He shared the images he generated at the time.

We've NEVER seen Image Generation This Good! | SNEAK PEAK

The May alpha test version could generate images in multiple aspect ratios within the image model. YouTuber MattVidPro AI then showcased several of the images, which were generated in a 16:9 aspect ratio. This version also showed the model’s prowess at rendering high-quality text, which continues to be a pain point for rival models, even top generators such as Stable Diffusion and Midjourney.


Some of the showcased examples included text melded into a brick wall, a neon sign of words, a billboard sign in a city, a cake decoration, and a name etched into a mountain. The model also maintains DALL-E's strength at generating people. One such image displayed a woman eating spaghetti at a party from a fisheye point of view.

The leaker returned to the Discord channel in mid-July with more details and new images. He claimed to be a part of a “closed alpha” test version that included approximately 400 subjects. He added that he was invited to the trial via email and was also included in the testing of the original DALL-E and DALL-E 2. This is what led to the conclusion that the alpha test might be for DALL-E 3, though it has not been confirmed.

The model has been updated considerably between May and July. The leaker has showcased this by sharing images generated from the same prompt, showing how much more powerful DALL-E 3 has become over time. The prompt reads: “A painting of a pink jester giving a high five to a panda while in a cycling competition. The bikes are made of cheese and the ground is very muddy. They are driving in a foggy forest. The panda is angry.”

The May alpha produces the general scene that hits most of the points of the prompt. There’s a little distortion in the hands connecting, and the wheels of the bikes are yellow as opposed to being made of cheese. However, the July alpha is far more detailed, with the pink jester and the panda clearly high-fiving and the bicycle wheels made of cheese in several generations.

Meanwhile, in Midjourney, the jester is missing from the scene, and the pandas are on motorcycles instead of bicycles. There are roads instead of mud, and the pandas are happy instead of angry.

There are a host of DALL-E 3 July alpha image examples that show the potential of the model. However, with the alpha test being uncensored, the leaker noted that it also has the potential to generate scenes of “violence and nudity or copyrighted material such as company logos.”

Some examples include a gory anime girl, a Game of Thrones character, a Grand Theft Auto V cover, a zombie Jesus eating a Subway sandwich (which also suggests mild gore), and Shrek being dug up at an archeological dig, among others.

MattVidPro AI noted that the image model generates images as if they’re supposed to be in a specific style.

DALL-E 2 launched in April 2022 but was heavily regulated with a waitlist due to its popularity and concerns about ethics and safety. The AI image generator became accessible to the public in September 2022.

Fionna Agomuoh
Fionna Agomuoh is a Computing Writer at Digital Trends. She covers a range of topics in the computing space, including…
A first look at Adobe’s new AI video generation tools
Adobe Creative Cloud Suite apps list.

Adobe previewed its upcoming video AI tools, part of the Firefly video model the company announced in April, in a newly released YouTube post. The features (and model) are expected to arrive by the end of the year and be available both in the Premiere Pro beta app and on a free website.

The company highlighted three new features that are currently in private beta but will be ready for public release later this year: Generative Extend, Text to Video, and Image to Video. Generative Extend will lengthen any input video by up to two seconds, while the Text and Image to Video functions allow users to generate high-definition, five-second-long clips using word and picture prompts.

A new definition of ‘open source’ could spell trouble for Big AI
Meta AI can generate images within a chat in about five seconds.

The Open Source Initiative (OSI), self-proclaimed steward of the open source definition, the most widely used standard for open-source software, announced an update to what constitutes an "open source AI" on Thursday. The new wording could now exclude models from industry heavyweights like Meta and Google.

"Open Source has demonstrated that massive benefits accrue to everyone after removing the barriers to learning, using, sharing, and improving software systems," the OSI wrote in a recent blog post. "For AI, society needs the same essential freedoms of Open Source to enable AI developers, deployers, and end users to enjoy those same benefits."

People are making entire short films with this new AI video-generation app
screenshot of a MiniMax AI video of a dog running through a field

Alibaba- and Tencent-backed startup Minimax, one of China's "AI tigers," has released its Video-01 text-to-video model, which can generate highly accurate depictions of humans, down to their hand motions. Minimax unveiled the new tool Saturday at its inaugural developer conference in Shanghai.

https://x.com/JunieLauX/status/1829950412340019261
