
Nvidia just released an open-source LLM to rival GPT-4

Nvidia CEO Jensen Huang.
Nvidia

Nvidia, maker of some of the most sought-after GPUs in the AI industry, has released an open-source large language model that reportedly performs on par with leading proprietary and open-access models from OpenAI, Anthropic, Meta, and Google.

The company introduced its new NVLM 1.0 family in a recently released white paper, led by the 72-billion-parameter NVLM-D-72B model. “We introduce NVLM 1.0, a family of frontier-class multimodal large language models that achieve state-of-the-art results on vision-language tasks, rivaling the leading proprietary models (e.g., GPT-4o) and open-access models,” the researchers wrote.


The new model family is reportedly already capable of “production-grade multimodality,” with exceptional performance across a variety of vision and language tasks, as well as improved text-only responses compared with the base LLM on which the NVLM family is built. “To achieve this, we craft and integrate a high-quality text-only dataset into multimodal training, alongside a substantial amount of multimodal math and reasoning data, leading to enhanced math and coding capabilities across modalities,” the researchers explained.

The result is an LLM that can explain why a meme is funny as easily as it can solve a complex math equation step by step. Nvidia also reports that this multimodal training recipe increased the model’s text-only accuracy by an average of 4.3 points across common industry benchmarks.
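The white paper describes that data-blending recipe in prose rather than code, but the core idea is simple to sketch. Below is a minimal, hypothetical Python illustration of interleaving a high-quality text-only dataset with multimodal examples in each training batch; the sample data, the mixed_batches helper, the 25% text fraction, and the batch format are all assumptions for illustration, not details taken from the NVLM paper.

```python
import random

# Hypothetical sketch of the data-blending idea described above: mix a
# high-quality text-only dataset into multimodal (image + text) training
# batches. All names, examples, and ratios here are illustrative
# assumptions, not details from the NVLM paper.

text_only_data = [
    {"image": None,
     "prompt": "Prove that the sum of two even numbers is even.",
     "target": "Let a = 2m and b = 2n; then a + b = 2(m + n) ..."},
]

multimodal_data = [
    {"image": "meme_001.png",
     "prompt": "Why is this meme funny?",
     "target": "The humor comes from the mismatch between ..."},
    {"image": "chart_042.png",
     "prompt": "What trend does this chart show?",
     "target": "The chart shows a steady increase in ..."},
]

def mixed_batches(text_data, mm_data, batch_size=4, text_fraction=0.25):
    """Yield batches holding a fixed fraction of text-only samples."""
    n_text = max(1, int(batch_size * text_fraction))
    while True:
        batch = random.choices(text_data, k=n_text)               # text-only slice
        batch += random.choices(mm_data, k=batch_size - n_text)   # multimodal slice
        random.shuffle(batch)
        yield batch

# Peek at two batches to confirm the mix.
gen = mixed_batches(text_only_data, multimodal_data)
for step in range(2):
    batch = next(gen)
    print(f"step {step}:", [ex["image"] or "text-only" for ex in batch])
```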

A screenshot from the NVLM white paper showing the model explaining why a meme is funny.
Nvidia

Nvidia appears serious about ensuring that this model meets the Open Source Initiative’s newest definition of “open source”: it has not only made the model weights available for public review, but also promised to release the model’s source code in the near future. This is a marked departure from rivals like OpenAI and Google, which jealously guard their LLMs’ weights and source code. In doing so, Nvidia has positioned the NVLM family not necessarily to compete directly against GPT-4o and Gemini 1.5 Pro, but rather to serve as a foundation for third-party developers to build their own chatbots and AI applications.
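If the weights land on Hugging Face as released models typically do, building on them could look something like the sketch below. This is a hypothetical example, assuming a repository ID of nvidia/NVLM-D-72B that ships its own modeling code (hence trust_remote_code=True); the actual loading and generation interface is defined by the model card, so treat every call here as an assumption to verify before use.

```python
# Hypothetical sketch of loading the released weights with Hugging Face
# transformers. The repo ID, dtype choice, and generic generate() call
# are assumptions; a custom multimodal repo may expose its own chat or
# image-handling helpers instead. Check Nvidia's model card first.
import torch
from transformers import AutoModel, AutoTokenizer

model_id = "nvidia/NVLM-D-72B"  # assumed repository name

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModel.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # a 72B-parameter model needs serious hardware
    device_map="auto",            # shard layers across available GPUs
    trust_remote_code=True,       # the repo supplies its own modeling code
).eval()

# Text-only prompt; image inputs would go through the repo's own helpers.
prompt = "Explain, step by step, why the sum of two even numbers is even."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

Note that device_map="auto" relies on the accelerate package, and 72 billion parameters in bfloat16 occupy roughly 144GB of memory before activations, so a multi-GPU node is the realistic target for running this locally.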

Andrew Tarantola