Skip to main content

Windows 11 will soon harness your GPU for generative AI

Following the introduction of Copilot, its latest smart assistant for Windows 11, Microsoft is yet again advancing the integration of generative AI with Windows. At the ongoing Ignite 2023 developer conference in Seattle, the company announced a partnership with Nvidia on TensorRT-LLM that promises to elevate user experiences on Windows desktops and laptops with RTX GPUs.

The new release is set to introduce support for new large language models, making demanding AI workloads more accessible. Particularly noteworthy is its compatibility with OpenAI’s Chat API, which enables local execution (rather than the cloud) on PCs and workstations with RTX GPUs starting at 8GB of VRAM.

Nvidia’s TensorRT-LLM library was released just last month and is said to help improve the performance of large language models (LLMs) using the Tensor Cores on RTX graphics cards. It provides developers with a Python API to define LLMs and build TensorRT engines faster without deep knowledge of C++ or CUDA.

Get your weekly teardown of the tech behind PC gaming
Check your inbox!

With the release of TensorRT-LLM v0.6.0, navigating the complexities of custom generative AI projects will be simplified thanks to the introduction of AI Workbench. This is a unified toolkit facilitating the quick creation, testing, and customization of pretrained generative AI models and LLMs. The platform is also expected to enable developers to streamline collaboration and deployment, ensuring efficient and scalable model development.

A graph showing TensorRT-LLM inference performance on Windows 11.
Nvidia

Recognizing the importance of supporting AI developers, Nvidia and Microsoft are also releasing DirectML enhancements. These optimizations accelerate foundational AI models like Llama 2 and Stable Diffusion, providing developers with increased options for cross-vendor deployment and setting new standards for performance.

The new TensorRT-LLM library update also promises a substantial improvement in inference performance, with speeds up to five times faster. This update also expands support for additional popular LLMs, including Mistral 7B and Nemotron-3 8B, and extends the capabilities of fast and accurate local LLMs to a broader range of portable Windows devices.

The integration of TensorRT-LLM for Windows with OpenAI’s Chat API through a new wrapper will allow hundreds of AI-powered projects and applications to run locally on RTX-equipped PCs. This will potentially eliminate the need to rely on cloud services and ensure the security of private and proprietary data on Windows 11 PCs.

The future of AI on Windows 11 PCs still has a long way to go. With AI models becoming increasingly available and developers continuing to innovate, harnessing the power of Nvidia’s RTX GPUs could be a game-changer. However, it is too early to say whether this will be the final piece of the puzzle that Microsoft desperately needs to fully unlock the capabilities of AI on Windows PCs.

Kunal Khullar
Kunal is a Computing writer contributing content around PC hardware, laptops, monitors, and more for Digital Trends. Having…
Grok 2.0 takes the guardrails off AI image generation
Elon Musk as Wario in a sketch from Saturday Night Live.

Elon Musk's xAI company has released two updated iterations of its Grok chatbot model, Grok-2 and Grok-2 mini. They promise improved performance over their predecessor, as well as new image-generation capabilities that will enable X (formerly Twitter) users to create AI imagery directly on the social media platform.

“We are excited to release an early preview of Grok-2, a significant step forward from our previous model, Grok-1.5, featuring frontier capabilities in chat, coding, and reasoning. At the same time, we are introducing Grok-2 mini, a small but capable sibling of Grok-2. An early version of Grok-2 has been tested on the LMSYS leaderboard under the name 'sus-column-r,'” xAI wrote in a recent blog post. The new models are currently in beta and reserved for Premium and Premium+ subscribers, though the company plans to make them available through its Enterprise API later in the month.

Read more
Nvidia reportedly caught scraping AI data from Netflix and YouTube (again)
Nvidia CEO Jensen in front of a background.

According to a damning report from 404 Media, backed with internal Slack chats, emails, and documents obtained by the outlet, Nvidia helped itself to "a human lifetime visual experience worth of training data per day," Ming-Yu Liu, vice president of Research at Nvidia and a Cosmos project leader, admitted in a May email.

Unnamed former Nvidia employees told 404 that they had been asked to scrape video content from Netflix, YouTube, and other online sources in order to obtain training data for use with the company's various AI products. Those include Nvidia’s Omniverse 3D world generator, self-driving car systems, and “digital human.”

Read more
PC gamers still prefer Windows 10 over Windows 11
A man stands in front of a gaming PC.

Windows 11 saw a decline in the latest Steam hardware and software survey for July 2024. According to Valve's data, gamers using Microsoft's newer operating system dropped below the 46% threshold. Currently, Windows 11 accounts for approximately 45.81% of all Windows users on Steam, marking a decrease of 0.82% from the previous month.

In contrast, Windows 10 experienced an increase of 0.74%, reaching a 50.16% share. Although gaming performance is generally similar on both operating systems, a recent test by Hardware Unboxed reveals that Windows 10 may offer better performance in certain titles due to the core isolation feature, where memory integrity is enabled by default on Windows 11.

Read more