Nvidia's new Tesla cards meet the needs of the growing capacities of AI services

Now that Nvidia has addressed the consumer market with its latest graphics cards based on the “Pascal” architecture, the next products in the company’s Pascal rollout target the deep neural network market to accelerate machine learning. These arrive in the form of Nvidia’s new Tesla P4 and Tesla P40 accelerator cards, which speed up the production inferencing workloads carried out by services that use artificial intelligence.

There are essentially two types of accelerator cards for deep neural networks: training and inference. The former should speak for itself, accelerating the training of a deep neural network before it’s deployed in the field. Inference, however, is the process of providing an input to the trained network and having it extract data based on that input. That includes translating speech in real time and localizing faces in images.

According to Nvidia, the new Tesla P4 and Tesla P40 accelerator cards are designed for inferencing and include specialized inference instructions based on 8-bit operations, making them 45 times faster in response time than an Intel Xeon E5-2690v4 processor. They also provide a 4x improvement over the company’s previous generation of “Maxwell” Tesla cards, the M40 and M4.
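To give a rough sense of what 8-bit inference means, here is a minimal Python sketch of a dot product computed on values quantized to signed 8-bit integers and then rescaled. It is an illustration of the general idea only, not Nvidia's implementation: the function names, the sample weights and inputs, and the simple symmetric scaling scheme are all assumptions made for this example.

```python
def quantize(values, scale):
    """Map floats to signed 8-bit integers using a symmetric scale (assumed scheme)."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def int8_dot(a_q, b_q):
    """Dot product over 8-bit integers, accumulated in a wider integer."""
    return sum(x * y for x, y in zip(a_q, b_q))


# Hypothetical weights and inputs, chosen only for illustration.
weights = [0.5, -1.0, 0.25, 0.75]
inputs = [1.0, 0.5, -0.5, 0.25]

# One scale per vector so the largest magnitude maps to 127.
w_scale = max(abs(w) for w in weights) / 127
x_scale = max(abs(x) for x in inputs) / 127

w_q = quantize(weights, w_scale)
x_q = quantize(inputs, x_scale)

# Integer arithmetic does the heavy lifting; a single multiply restores the scale.
approx = int8_dot(w_q, x_q) * w_scale * x_scale
exact = sum(w * x for w, x in zip(weights, inputs))

print(f"exact={exact:.4f} approx={approx:.4f}")
```

The quantized result only approximates the floating-point one, which is the usual trade-off: a trained network can often tolerate this small loss of precision in exchange for cheaper arithmetic.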

The company said this week during its GTC Beijing 2016 conference that the Tesla P4 sports a small form factor that’s ideal for data centers. It’s 40x more energy efficient than CPUs that are used for inferencing, and a single Tesla P4 server can replace 13 CPU-only servers built for video inferencing workloads. Meanwhile, the Tesla P40 is ideal for deep learning workloads, with a server containing eight of these accelerators able to replace more than 140 CPU-based servers.

Compared to the previous Tesla M40, the new P40 packs more CUDA cores, higher clock speeds, a faster memory clock, higher single-precision performance at 12 TFLOPS, and a higher transistor count at 12 billion. However, the power requirement (thermal envelope) stays the same, so Nvidia has managed to boost performance per watt without requiring the card to draw more power. The same holds true for the slower Tesla P4 when compared to the older Tesla M4 card.

“With the Tesla P100 and now Tesla P4 and P40, NVIDIA offers the only end-to-end deep learning platform for the data center, unlocking the enormous power of AI for a broad range of industries,” said Ian Buck, general manager of accelerated computing at Nvidia. “They slash training time from days to hours. They enable insight to be extracted instantly. And they produce real-time responses for consumers from AI-powered services.”

Nvidia revealed the Tesla P100 during its local GTC 2016 conference five months ago. This card is ideal for accelerating neural network training, delivering a performance increase of more than 12 times compared to the previous generation Maxwell-based solution. Again, neural networks need to be trained first before they’re deployed into the field, and the new Tesla card speeds up the process, cutting AI training down from weeks to days.

In addition to the two new Tesla cards, Nvidia also launched TensorRT, a library for “optimizing deep learning models for production deployment.” The company also introduced the Nvidia DeepStream SDK for simultaneously decoding and analyzing up to 93 HD video streams. Here’s a brief list of hardware details for Nvidia’s two new Tesla cards, which are now available:

Specification        Tesla P40     Tesla P4
GPU                  GP102         GP104
CUDA Cores           3,840         2,560
Base Clock           1,303MHz      810MHz
Boost Clock          1,531MHz      1,063MHz
GDDR5 Memory Clock   7.2Gbps       6Gbps
Memory Bus Width     384-bit       256-bit
GDDR5 Amount         24GB          8GB
Single Precision     12 TFLOPS     5.5 TFLOPS
TDP                  250 watts     50 to 75 watts
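
The single-precision figures in the list above can be roughly cross-checked against the core counts and boost clocks. The Python sketch below assumes the common convention of counting one fused multiply-add per CUDA core per clock as two floating-point operations, and it derives peak memory bandwidth from the per-pin data rate and bus width. The function and dictionary names are made up for this example, and the bandwidth numbers are derived here rather than quoted from Nvidia.

```python
def peak_fp32_tflops(cuda_cores, boost_clock_mhz):
    """Theoretical peak, assuming 2 FLOPs (one multiply-add) per core per clock."""
    return 2 * cuda_cores * boost_clock_mhz * 1e6 / 1e12


def memory_bandwidth_gbs(data_rate_gbps, bus_width_bits):
    """Peak bandwidth in GB/s: per-pin data rate times bus width, divided by 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


# Values taken from the specification list above.
cards = {
    "Tesla P40": {"cores": 3840, "boost_mhz": 1531, "gbps": 7.2, "bus_bits": 384},
    "Tesla P4": {"cores": 2560, "boost_mhz": 1063, "gbps": 6.0, "bus_bits": 256},
}

for name, spec in cards.items():
    tflops = peak_fp32_tflops(spec["cores"], spec["boost_mhz"])
    bandwidth = memory_bandwidth_gbs(spec["gbps"], spec["bus_bits"])
    print(f"{name}: {tflops:.2f} TFLOPS, {bandwidth:.1f} GB/s")
```

Under that assumption the P40 works out to about 11.76 TFLOPS and the P4 to about 5.44 TFLOPS, in line with the rounded 12 and 5.5 TFLOPS figures listed.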