
No, ChatGPT isn’t going to cause another GPU shortage

ChatGPT is exploding, and the AI model behind it runs on Nvidia graphics cards. One analyst said around 10,000 Nvidia GPUs were used to train ChatGPT, and as the service continues to expand, so does the need for GPUs. Anyone who lived through the rise of crypto in 2021 can smell a GPU shortage on the horizon.

I’ve seen a few reporters draw that exact connection, but it’s misguided. The days of crypto-driven GPU shortages are behind us. Although we’ll likely see a surge in demand for graphics cards as AI continues to boom, that demand isn’t aimed at the best graphics cards installed in gaming rigs.


Why Nvidia GPUs are built for AI

A render of Nvidia’s RTX A6000 GPU.

First, we’ll address why Nvidia graphics cards are so great for AI. Nvidia has bet on AI for the past several years, and it’s paid off with the company’s stock price soaring after the rise of ChatGPT. There are two reasons why you see Nvidia at the heart of AI training: Tensor cores and CUDA.

CUDA is Nvidia’s Application Programming Interface (API), used in everything from its most expensive data center GPUs to its cheapest gaming GPUs. CUDA acceleration is supported in machine learning libraries like TensorFlow, vastly speeding up training and inference. CUDA is also a big part of why AMD is so far behind Nvidia in AI.
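
As a quick illustration (my own sketch, not something from Nvidia or the article), here’s how CUDA acceleration shows up from a TensorFlow user’s point of view: if an Nvidia GPU and a compatible driver are present, the library exposes it as a device and routes work to it.

    import tensorflow as tf

    # TensorFlow's GPU build is linked against CUDA and cuDNN; any Nvidia GPU
    # with a compatible driver shows up as a visible device.
    gpus = tf.config.list_physical_devices("GPU")
    print("CUDA-capable GPUs visible to TensorFlow:", gpus)

    if gpus:
        # Pin a matrix multiplication to the first GPU so CUDA handles it
        # instead of the CPU.
        with tf.device("/GPU:0"):
            a = tf.random.normal((4096, 4096))
            b = tf.random.normal((4096, 4096))
            c = tf.matmul(a, b)
        print("Result computed on:", c.device)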

Don’t confuse CUDA with Nvidia’s CUDA cores, however. CUDA is the platform that a ton of AI apps run on, while CUDA cores are the actual cores inside Nvidia GPUs. They share a name because CUDA cores are optimized to run CUDA applications. Nvidia’s gaming GPUs have CUDA cores, and they support CUDA apps.

Tensor cores are basically dedicated AI cores. They handle matrix multiplication, which is the secret sauce that speeds up AI training. The idea is simple: instead of working through numbers one at a time the way most processor cores do, a Tensor core multiplies whole blocks of values at once, completing many multiply-accumulate operations in a single clock cycle. Because training a model is largely an enormous series of matrix multiplications, that parallelism is what makes training so much faster.
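
To make that concrete, here’s a hedged sketch of how a TensorFlow user would typically put Tensor cores to work: enabling the mixed_float16 policy makes the model compute in half precision, which is the format Tensor cores accelerate (exact behavior depends on the GPU generation and library version).

    import tensorflow as tf
    from tensorflow.keras import layers, mixed_precision

    # Keep variables in float32 for numerical stability, but compute in float16
    # so the matrix multiplications can run on Tensor cores where available.
    mixed_precision.set_global_policy("mixed_float16")

    model = tf.keras.Sequential([
        layers.Dense(1024, activation="relu", input_shape=(512,)),
        layers.Dense(10),  # raw logits; the loss below handles them in float32
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )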

Again, Nvidia’s gaming GPUs like the RTX 4080 have Tensor cores (and sometimes even more than costly data center GPUs). However, for all of the specs Nvidia cards have to accelerate AI models, none is as important as memory. And Nvidia’s gaming GPUs don’t have a lot of memory.

It all comes down to memory

A stack of HBM memory.

“Memory size is the most important,” according to Jeffrey Heaton, author of several books on artificial intelligence and a professor at Washington University in St. Louis. “If you do not have enough GPU RAM, your model fitting/inference simply stops.”

Heaton, who has a YouTube channel dedicated to how well AI models run on certain GPUs, noted that CUDA cores are important as well, but memory capacity is the dominant factor when it comes to how a GPU functions for AI. The RTX 4090 has a lot of memory by gaming standards — 24GB of GDDR6X — but very little compared to a data center-class GPU. For instance, Nvidia’s latest H100 GPU has 80GB of HBM3 memory, as well as a massive 5,120-bit memory bus.

You can get by with less, but you still need a lot of memory. Heaton recommends beginners have no less than 12GB, while a typical machine learning engineer will have one or two 48GB professional Nvidia GPUs. According to Heaton, “most workloads will fall more in the single A100 to eight A100 range.” Nvidia’s A100 GPU has 40GB of memory.

You can see this scaling in action, too. Puget Systems shows a single A100 with 40GB of memory performing around twice as fast as a single RTX 3090 with its 24GB of memory. And that’s despite the fact that the RTX 3090 has almost twice as many CUDA cores and nearly as many Tensor cores.

Memory is the bottleneck, not raw processing power. That’s because training AI models relies on large datasets, and the more of that data you can store in memory, the faster (and more accurately) you can train a model.
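
Some rough arithmetic shows why (these are back-of-envelope numbers of my own, not figures from Heaton or the article): a common rule of thumb is that standard training with the Adam optimizer costs on the order of 16 bytes of GPU memory per parameter for the weights, gradients, and optimizer state, before you count activations or the data batch itself.

    # Rough, illustrative estimate only; real frameworks add activation memory,
    # temporary buffers, and fragmentation on top of this.
    BYTES_PER_PARAM_ADAM_FP32 = 16  # ~4 weight + 4 gradient + 8 optimizer state

    def training_memory_gb(num_params: int) -> float:
        return num_params * BYTES_PER_PARAM_ADAM_FP32 / 1e9

    print(training_memory_gb(1_500_000_000))    # ~24GB: a 1.5B-parameter model already fills an RTX 4090
    print(training_memory_gb(175_000_000_000))  # ~2,800GB: GPT-3-scale training dwarfs any single card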

Different needs, different dies

Hopper H100 graphics card.

Nvidia’s gaming GPUs generally aren’t suitable for AI due to how little video memory they have compared to enterprise-grade hardware, but there’s a separate issue here as well. Nvidia’s workstation GPUs don’t usually share a GPU die with its gaming cards.

For instance, the A100 that Heaton referenced uses the GA100 GPU, a die from Nvidia’s Ampere range that was never used on gaming-focused cards (not even the high-end RTX 3090 Ti). Similarly, Nvidia’s latest H100 uses a completely different architecture from the RTX 40-series, meaning it uses a different die as well.

There are exceptions. Nvidia’s AD102 GPU, which is inside the RTX 4090 (the RTX 4080 uses the smaller AD103 die), is also used in a small range of Ada Lovelace enterprise GPUs (the L40 and RTX 6000 Ada Generation). In most cases, though, Nvidia can’t just repurpose a gaming GPU die for a data center card. They’re separate worlds.

There are some fundamental differences between the GPU shortage we saw due to crypto mining and the rise in popularity of AI models. According to Heaton, the GPT-3 model required over 1,000 Nvidia A100 GPUs to train and about eight to run. These GPUs also have access to the high-bandwidth NVLink interconnect, while Nvidia’s RTX 40-series GPUs don’t. It’s a maximum of 24GB of memory on Nvidia’s gaming cards versus hundreds of gigabytes pooled across NVLinked GPUs like the A100.
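
A quick back-of-envelope check (my own numbers, not Heaton’s) shows why that figure of roughly eight GPUs is plausible: GPT-3’s 175 billion parameters, stored in half precision, come to about 350GB of weights before any working memory, which NVLinked A100s can pool but a single gaming card cannot.

    # Illustrative weights-only footprint; ignores the KV cache, activations,
    # and other runtime overhead.
    GPT3_PARAMS = 175_000_000_000
    BYTES_PER_PARAM_FP16 = 2

    weights_gb = GPT3_PARAMS * BYTES_PER_PARAM_FP16 / 1e9
    print(weights_gb)        # ~350GB of weights alone
    print(weights_gb / 80)   # ~4-5 of the 80GB A100s just to hold the model
    print(weights_gb / 24)   # ~15 RTX 4090s' worth of memory, with no NVLink to pool it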

There are some other concerns, such as memory dies being allocated for professional GPUs over gaming ones, but the days of rushing to your local Micro Center or Best Buy for the chance to find a GPU in stock are gone. Heaton summed that point up nicely: “Large language models, such as ChatGPT, are estimated to require at least eight GPUs to run. Such estimates assume the high-end A100 GPUs. My speculation is that this could cause a shortage of the higher-end GPUs, but may not affect gamer-class GPUs, with less RAM.”

Jacob Roach
Former Digital Trends Contributor