Skip to main content

Meta’s new AI model can turn text into 3D images in under a minute

an array of 3D generated images made by Meta 3D Gen
Meta

Meta’s latest foray into AI image generation is a quick one. The company introduced its new “3D Gen” model on Tuesday, a “state-of-the-art, fast pipeline” for transforming input text into high-fidelity 3D images that can output them in under a minute.

Recommended Videos

What’s more, the system is reportedly able to apply new textures and skins to both generated and artist-produced images using text prompts.

Per a recent study from the Meta Gen AI research team, 3D Gen will not only offer both high-resolution textures and material maps but support physically-based rendering (PBR) and generative re-texturing capabilities as well.

📣 New research from GenAI at Meta, introducing Meta 3D Gen: A new system for end-to-end generation of 3D assets from text in <1min.

Meta 3D Gen is a new combined AI system that can generate high-quality 3D assets, with both high-resolution textures and material maps end-to-end,… pic.twitter.com/rDD5GzNinY

— AI at Meta (@AIatMeta) July 2, 2024

The team estimates an average inference time of just 30 seconds in creating the initial 3D model using Meta’s 3D AssetGen model. Users can then go back and either refine the existing model texture or replace it with something new, both via text prompts, using Meta 3D TextureGen, a process the company figures should take no more than an additional 20 seconds of inference time.

“By combining their strengths,” the team wrote in its study abstract, “3DGen represents 3D objects simultaneously in three ways: in view space, in volumetric space, and in UV (or texture) space.” The Meta team set its 3D Gen model against a number of industry baselines and compared along a variety of factors including text prompt fidelity, visual quality, texture details and artifacts. By combining the functions of both models, images generated by the integrated two-stage process were picked by annotators over their single-stage counterparts 68% of the time.

Granted, the system discussed in this paper is still under development and not yet ready for public use, but the technical advances that this study illustrates could prove transformational across a number of creative disciplines, from game and film effects to VR applications.

Giving users the ability to not only create but edit 3D-generated content, both quickly and intuitively, could drastically lower the barrier to entry for such pursuits. It’s not hard to imagine the effect this could have on game development, for example.

Andrew Tarantola
Former Digital Trends Contributor
Andrew Tarantola is a journalist with more than a decade reporting on emerging technologies ranging from robotics and machine…
A new definition of ‘open source’ could spell trouble for Big AI
Meta AI can generate images within a chat in about five seconds.

The Open Source Initiative (OSI), self-proclaimed steward of the open source definition, the most widely used standard for open-source software, announced an update to what constitutes an "open source AI" on Thursday. The new wording could now exclude models from industry heavyweights like Meta and Google.

"Open Source has demonstrated that massive benefits accrue to everyone after removing the barriers to learning, using, sharing, and improving software systems," the OSI wrote in a recent blog post. "For AI, society needs the same essential freedoms of Open Source to enable AI developers, deployers, and end users to enjoy those same benefits."

Read more
Meta’s next AI model to require nearly 10 times the power to train
mark zuckerberg speaking

Facebook parent company Meta will continue to invest heavily in its artificial intelligence research efforts, despite expecting the nascent technology to require years of work before becoming profitable, company executives explained on the company's Q2 earnings call Wednesday.

Meta is "planning for the compute clusters and data we'll need for the next several years," CEO Mark Zuckerberg said on the call. Meta will need an "amount of compute… almost 10 times more than what we used to train Llama 3," he said, adding that Llama 4 will "be the most advanced [model] in the industry next year." For reference, the Llama 3 model was trained on a cluster of 16,384 Nvidia H100 80GB GPUs.

Read more
Meta unveils Llama 3.1, its biggest and best open source model yet
llama 3.1 logo

Facebook parent company Meta announced the release of its Llama 3.1 open source large language model on Tuesday. The new LLM will be available in three sizes -- 8B, 70B, and 405B parameters -- the latter being the largest open-source AI built to date, which Meta CEO Mark Zuckerberg describes as "the first frontier-level open source AI model."

"Last year, Llama 2 was only comparable to an older generation of models behind the frontier," Zuckerberg wrote in a blog post Tuesday. "This year, Llama 3 is competitive with the most advanced models and leading in some areas. Starting next year, we expect future Llama models to become the most advanced in the industry."

Read more