
The 10 announcements that made 2024 a landmark year for AI

ChatGPT and Siri integration on iPhone.
Siri is currently piggybacking on ChatGPT. Nadeem Sarwar / Digital Trends

We’ve officially passed the second anniversary of the start of the AI boom, and things haven’t slowed down. Just the opposite. Generative AI is ramping up at a pace that feels nearly overwhelming, relentlessly expanding into new platforms, mediums, and even devices.

Here are the 10 announcements that made 2024 a monumental year in the world of AI.


OpenAI releases GPT-4o

Mira Murati announcing GPT-4o.
OpenAI

When ChatGPT (running GPT-3.5) first arrived in November 2022, it was basically a fancy, computer-controlled game of Mad Libs. Don’t get me wrong, even that capability was revolutionary at the time, but it wasn’t until the release of GPT-4o in May 2024 that generative AI systems truly came into their own.

Building on its predecessor’s ability to analyze and generate both text and images, GPT-4o provides a more comprehensive contextual understanding than GPT-4 alone. This translates to better performance in everything from image captioning and visual analysis to generating both creative and analytical content like graphs, charts, and images.

Advanced Voice Mode helps computers speak like humans

ChatGPT Advanced Voice Mode Desktop app
OpenAI

In September, OpenAI once again showed why it is the leading artificial intelligence firm by releasing its Advanced Voice Mode to ChatGPT subscribers. This feature eliminated the need for users to type their questions into a prompt window, instead enabling them to converse with the AI as they would another person.

Leveraging GPT-4o’s human-equivalent response times, Advanced Voice Mode fundamentally changed how people can interact with machine intelligence and helped users unleash the AI’s full creative capacity.

Generative AI comes to the edge

Using Visual Intelligence on an iPhone 16 Pro showing ChatGPT answer.
Visual Intelligence on iPhones relies on the camera to make sense of the world around it. Christine Romero-Chan / Digital Trends

When ChatGPT debuted in 2022, it was the only AI in town and available in precisely one place: ChatGPT.com. Oh, what a difference two years makes. These days, you can find generative AI in everything from smartphones and smart home devices to autonomous vehicles and health-monitoring gadgets. ChatGPT, for example, is available as a desktop app, an API, a mobile app, and even via an 800 number. Microsoft, for its part, has integrated AI directly into its line of Copilot+ laptops.

Perhaps the most significant example, of course, is Apple Intelligence. It might not have been the most successful launch (we’re still waiting for many of its features), but in terms of making the powers of generative AI as accessible as possible, nothing was as important as Apple Intelligence.

Now, neither Copilot+ PCs nor Apple Intelligence panned out how the companies involved probably wanted — especially for Microsoft — but as we all know, this is only the beginning.

The resurgence of nuclear power production

Three Mile Island
Constellation Energy

Before this year, nuclear power was seen as a losing proposition in America, deemed unreliable and unsafe due in large part to the Three Mile Island incident of 1979, in which one of the plant’s primary reactors partially melted down and spewed toxic, radioactive material into the atmosphere. However, with the rapidly increasing amounts of electrical power that modern large language models require — and the massive stress they place on regional power grids — many leading AI firms are taking a closer look at running their data centers using the power of the atom.

Amazon, for example, purchased a nuclear-powered AI data center from Talen in March, then signed an agreement to acquire miniaturized, self-contained Small Modular Reactors (SMRs) from Energy Northwest in October. Microsoft, not to be outdone, has purchased the production capacity of Three Mile Island itself and is currently working to get Reactor One back online and generating electricity.

Agents are poised to be the next big thing in generative AI

glasses and chatgpt
Matheus Bertelli / Pexels

Turns out, there’s only so much training data, power, and water you can throw at the task of growing your large language model before you run into diminishing returns. The AI industry experienced this firsthand in 2024 and, in response, has begun to pivot away from the massive LLMs that originally defined the generative AI experience in favor of agents: smaller, more responsive models designed to perform specific tasks rather than try to do everything a user might ask of them.

Anthropic debuted its agent, dubbed Computer Use, in October. Microsoft followed suit with Copilot Actions in November, while OpenAI is reportedly set to release its agent feature in January.

The rise of reasoning models

The openAI o1 logo
OpenAI

Many of today’s large language models are geared more toward generating responses as quickly as possible, often at the expense of accuracy and correctness. OpenAI’s o1 reasoning model, which the company released as a preview in September and as a fully functional model in December, takes the opposite approach: It sacrifices response speed to internally verify its rationale for a given answer, ensuring that it is as accurate and complete as possible.

While this technology has yet to be fully embraced by the public (o1 is currently only available to Plus and Pro tier subscribers), leading AI companies are pressing ahead with versions of their own. Google announced its answer to o1, dubbed Gemini 2.0 Flash Thinking Experimental, on December 19, while OpenAI revealed that it is already working on o1’s successor, which it calls o3, during its 12 Days of OpenAI live-stream event on December 20.

AI-empowered search spreads across the internet

Perplexity AI app running on an iPhone 14 Pro.
Joe Maring / Digital Trends

Generative AI is seemingly everywhere these days, so why wouldn’t it be integrated into one of the internet’s most basic features? Google has been toying with the technology for the past two years, first releasing the Search Generative Experience in May of 2023 before rolling out its AI Overview feature this past May. AI Overview generates a summary of the information a user requests at the top of the search results page.

Perplexity AI takes that technique a step further. Its “answer engine” scours the internet for the information a user requests, then synthesizes that data into a coherent, conversational (and cited) response, effectively eliminating the need to click through a list of links. OpenAI, ever the innovator, developed a nearly identical system for its chatbot, dubbed ChatGPT Search, which it debuted in October.

Anthropic’s Artifacts kicks off a collaborative revolution

The Anthropic logo on a red background.
Anthropic

Trying to generate, analyze, and edit large files — whether they’re long-form creative essays or computer code snippets — directly within the chat stream can be overwhelming, requiring you to endlessly scroll back and forth to view the entirety of the document.

Anthropic’s Artifacts feature, which debuted in June, helps mitigate that issue by providing users with a separate preview window in which to view the AI-crafted text outside of the main conversation. The feature proved to be such a hit that OpenAI quickly followed suit with its own version.

Anthropic’s latest models and features have made it a formidable rival to OpenAI and Google this year, which alone feels significant.

Image and video generators finally figure out fingers

Use Camera Control to direct every shot with intention.

Learn how with today's Runway Academy. pic.twitter.com/vCGMkkhKds

— Runway (@runwayml) November 2, 2024

It used to be that spotting an AI-generated image or video was as simple as counting the subject’s appendages — anything more than two arms, two legs, and 10 fingers was an obvious tell, as Stable Diffusion 3’s Cronenberg-esque images demonstrated in June. Yet, as 2024 comes to a close, differentiating between human and machine-made content has become significantly more difficult as image and video generators have rapidly improved both the quality and physiological accuracy of their outputs.

AI video systems like Kling, Gen 3 Alpha, and Movie Gen are now capable of generating photorealistic clips with minimal distortion and fine-grain camera control, while the likes of Midjourney, DALL-E 3, and Imagen 3 can craft still images with a startling degree of realism (and minimal hallucinated artifacts) in myriad artistic styles.

Oh yeah, and OpenAI’s Sora finally made its debut as part of the company’s December announcements. The battle among AI video models is heating up, and they got shockingly impressive in 2024.

Elon Musk’s $10 billion effort to build the world’s biggest AI training cluster

Elon Musk at Tesla Cyber Rodeo.
Digital Trends

xAI launched Grok 2.0 this year, the latest model built right into X. But the bigger news around Elon Musk’s AI venture is where it’s headed next. In 2024, Musk set about constructing the “world’s largest supercomputer” just outside of Memphis, Tennessee, which came online at 4:20 a.m. on July 22. Driven by 100,000 Nvidia H100 GPUs, the supercluster is tasked with training new versions of xAI’s Grok generative AI model, which Musk claims will become “the world’s most powerful AI.”

Musk is expected to spend around $10 billion in capital and inference costs in 2024 alone but is reportedly working to double the number of GPUs powering the supercomputer in the new year.

Andrew Tarantola
Former Digital Trends Contributor