Skip to main content
  1. Home
  2. Computing
  3. News

Windows 11 will soon harness your GPU for generative AI

Add as a preferred source on Google

Following the introduction of Copilot, its latest smart assistant for Windows 11, Microsoft is yet again advancing the integration of generative AI with Windows. At the ongoing Ignite 2023 developer conference in Seattle, the company announced a partnership with Nvidia on TensorRT-LLM that promises to elevate user experiences on Windows desktops and laptops with RTX GPUs.

The new release is set to introduce support for new large language models, making demanding AI workloads more accessible. Particularly noteworthy is its compatibility with OpenAI’s Chat API, which enables local execution (rather than the cloud) on PCs and workstations with RTX GPUs starting at 8GB of VRAM.

Recommended Videos

Nvidia’s TensorRT-LLM library was released just last month and is said to help improve the performance of large language models (LLMs) using the Tensor Cores on RTX graphics cards. It provides developers with a Python API to define LLMs and build TensorRT engines faster without deep knowledge of C++ or CUDA.

With the release of TensorRT-LLM v0.6.0, navigating the complexities of custom generative AI projects will be simplified thanks to the introduction of AI Workbench. This is a unified toolkit facilitating the quick creation, testing, and customization of pretrained generative AI models and LLMs. The platform is also expected to enable developers to streamline collaboration and deployment, ensuring efficient and scalable model development.

A graph showing TensorRT-LLM inference performance on Windows 11.
Nvidia

Recognizing the importance of supporting AI developers, Nvidia and Microsoft are also releasing DirectML enhancements. These optimizations accelerate foundational AI models like Llama 2 and Stable Diffusion, providing developers with increased options for cross-vendor deployment and setting new standards for performance.

The new TensorRT-LLM library update also promises a substantial improvement in inference performance, with speeds up to five times faster. This update also expands support for additional popular LLMs, including Mistral 7B and Nemotron-3 8B, and extends the capabilities of fast and accurate local LLMs to a broader range of portable Windows devices.

The integration of TensorRT-LLM for Windows with OpenAI’s Chat API through a new wrapper will allow hundreds of AI-powered projects and applications to run locally on RTX-equipped PCs. This will potentially eliminate the need to rely on cloud services and ensure the security of private and proprietary data on Windows 11 PCs.

The future of AI on Windows 11 PCs still has a long way to go. With AI models becoming increasingly available and developers continuing to innovate, harnessing the power of Nvidia’s RTX GPUs could be a game-changer. However, it is too early to say whether this will be the final piece of the puzzle that Microsoft desperately needs to fully unlock the capabilities of AI on Windows PCs.

Kunal Khullar
Kunal Khullar is a computing writer at Digital Trends who contributes to various topics, including CPUs, GPUs, monitors, and…
Brave’s new Container feature is a lifesaver for anyone juggling multiple accounts
With this feature, you won't need to open three different browsers
Brave browser 3D logo

Brave has added Containers to its desktop browser, giving users a built-in way to keep different accounts, sessions, and browsing activity separate. The feature is available in Brave 1.92 for Windows, macOS, and Linux, and is rolling out in phases over the next few days.

Containers have been a highly requested feature, especially for users who regularly switch between work, personal, developer, or creator accounts. Once enabled, they let users open tabs in separate spaces where cookies and site storage are not shared outside that container.

Read more
Intel may bring back older desktop CPUs because DDR5 is getting too expensive
Older Intel Core CPUs from 10th to 14th Gen may get a second life
Intel Core i5-12400F box sitting in front of a gaming PC.

Intel may be preparing an unusual response to the ongoing memory crunch. According to Chinese outlet ITHome, citing ChannelGate, the company’s latest production plan includes restarting production of 13th-gen and 14th-gen Core processors.

The move is expected to increase supply across Intel’s 10th, 12th, 13th, and 14th Gen CPU families, especially in mainland China. For DIY PC builders, the timing is important. DDR5 memory prices have climbed sharply, making newer platforms harder to justify for anyone trying to build an affordable gaming PC.

Read more
Amazon wants to design in-house chips for Kindles, Fire TV, and Echo speakers
Apple did it first. Amazon is doing it now, starting with 40 million chips a year and a partner most people have never heard of.
Amazon Kindle Scribe dark mode featured image.

Apple's decision to design its own chips reshaped the consumer electronics industry. Amazon may be about to make the same call, just about two decades later.

Supply chain analyst Ming-Chi Kuo reports that Amazon is preparing to shift away from externally sourced processors for its consumer electronics lineup, marking what he describes as the company's first major processor procurement change in 20 years. The transition is expected to begin in 2027.

Read more