
Windows 11 will soon harness your GPU for generative AI

Following the introduction of Copilot, its new AI assistant for Windows 11, Microsoft is pushing generative AI even deeper into Windows. At the Ignite 2023 developer conference in Seattle, the company announced a partnership with Nvidia on TensorRT-LLM that promises to improve the experience on Windows desktops and laptops equipped with RTX GPUs.

The new release is set to add support for additional large language models, making demanding AI workloads more accessible. Particularly noteworthy is its compatibility with OpenAI’s Chat API, which allows these workloads to run locally (rather than in the cloud) on PCs and workstations with RTX GPUs that have at least 8GB of VRAM.

Nvidia’s TensorRT-LLM library was released just last month and is said to help improve the performance of large language models (LLMs) using the Tensor Cores on RTX graphics cards. It provides developers with a Python API to define LLMs and build TensorRT engines faster without deep knowledge of C++ or CUDA.
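To give a sense of what that looks like in practice, here is a minimal sketch of the high-level Python API found in more recent TensorRT-LLM releases; the exact class names and engine-building flow have changed between versions, so treat it as illustrative rather than a v0.6.0 reference.

```python
# Illustrative sketch of TensorRT-LLM's high-level Python API (newer releases).
# The model name is just an example; the library fetches the weights and builds
# a TensorRT engine for the local GPU before generating.
from tensorrt_llm import LLM, SamplingParams

llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")       # example model
params = SamplingParams(max_tokens=64, temperature=0.7)

outputs = llm.generate(["Explain Tensor Cores in one sentence."], params)
for output in outputs:
    print(output.outputs[0].text)
```

The point of the API is that the engine-building step happens behind the scenes, so developers can stay in Python rather than hand-tuning C++ or CUDA kernels.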


Alongside TensorRT-LLM v0.6.0, Nvidia is introducing AI Workbench, a unified toolkit aimed at simplifying custom generative AI projects by making it quicker to create, test, and customize pretrained generative AI models and LLMs. The platform is also expected to help developers streamline collaboration and deployment for efficient, scalable model development.

[Chart: TensorRT-LLM inference performance on Windows 11. Image: Nvidia]

Recognizing the importance of supporting AI developers, Nvidia and Microsoft are also releasing DirectML enhancements. These optimizations accelerate foundational AI models like Llama 2 and Stable Diffusion, providing developers with increased options for cross-vendor deployment and setting new standards for performance.
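The announcement doesn't spell out the exact tooling, but one common way developers reach DirectML from Python today is through ONNX Runtime's DirectML execution provider. A minimal sketch, with a hypothetical model file and dummy input, might look like this:

```python
# Minimal sketch: running an exported ONNX model on the GPU via the DirectML
# execution provider (requires the onnxruntime-directml package).
# The model path and input shape are placeholders, not Nvidia/Microsoft values.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession(
    "llama2_optimized.onnx",                              # hypothetical model file
    providers=["DmlExecutionProvider", "CPUExecutionProvider"],
)

input_name = session.get_inputs()[0].name
dummy_tokens = np.zeros((1, 8), dtype=np.int64)           # placeholder token IDs
outputs = session.run(None, {input_name: dummy_tokens})
print(outputs[0].shape)
```

Because DirectML sits on top of DirectX 12, the same code path can target GPUs from different vendors, which is what makes it attractive for cross-vendor deployment.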

The TensorRT-LLM v0.6.0 update also promises a substantial improvement in inference performance, with speeds up to five times faster than before. It expands support to additional popular LLMs, including Mistral 7B and Nemotron-3 8B, and brings fast, accurate local LLMs to a broader range of portable Windows devices.

The integration of TensorRT-LLM for Windows with OpenAI’s Chat API through a new wrapper will allow hundreds of AI-powered projects and applications to run locally on RTX-equipped PCs. This will potentially eliminate the need to rely on cloud services and ensure the security of private and proprietary data on Windows 11 PCs.
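In practice, that means an application already built on the official openai Python client should only need to be repointed at the local server. A minimal sketch, assuming a hypothetical localhost endpoint and model name exposed by the wrapper:

```python
# Minimal sketch, assuming the local TensorRT-LLM wrapper exposes an
# OpenAI-compatible endpoint on localhost. The URL, port, and model name
# below are placeholders, not values documented by Nvidia.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8000/v1",   # hypothetical local endpoint
    api_key="not-needed-locally",          # local servers typically ignore the key
)

response = client.chat.completions.create(
    model="llama-2-13b-chat",              # whichever model the local wrapper serves
    messages=[{"role": "user", "content": "Summarize TensorRT-LLM in one sentence."}],
)
print(response.choices[0].message.content)
```

Swapping the base URL is the only change to the application code, which is why existing Chat API projects could move to local RTX hardware with little effort.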

AI on Windows 11 PCs still has a long way to go. With AI models becoming more widely available and developers continuing to innovate, harnessing the power of Nvidia’s RTX GPUs could be a game-changer. However, it is too early to say whether this will be the final piece of the puzzle Microsoft needs to fully unlock the capabilities of AI on Windows PCs.
