Nvidia achieves massive increase in AI computing power with Volta and Tesla V100

Artificial intelligence (AI) and machine learning are incredibly important technological developments that require massive amounts of computing power. Microsoft is currently holding its Build 2017 developers conference, and there isn’t a product or service being highlighted that doesn’t integrate AI or machine learning in one way or another.

One of the best ways to build that kind of high-speed computing infrastructure is with GPUs, which can be more efficient than general-purpose CPUs for these workloads. Nvidia has been at the forefront of using GPUs for AI and machine learning applications, and it has just announced the Volta GPU computing architecture and the Tesla V100 data center GPU.

Nvidia calls Volta the “world’s most powerful” GPU computing architecture. It is built with 21 billion transistors and, according to Nvidia, provides deep learning performance equivalent to 100 CPUs. The company puts that at five times the peak teraflops of its Pascal architecture and 15 times that of its earlier Maxwell architecture. According to Nvidia, Volta’s performance gain is four times the improvement that Moore’s law would have predicted.

According to Jensen Huang, Nvidia founder and CEO, “Deep learning, a groundbreaking AI approach that creates computer software that learns, has insatiable demand for processing power. Thousands of Nvidia engineers spent over three years crafting Volta to help meet this need, enabling the industry to realize AI’s life-changing potential.”

Alongside the Volta architecture, Nvidia unveiled the Tesla V100 data center GPU, which incorporates a number of new technologies. They include the following, taken from Nvidia’s announcement:

  • Tensor Cores designed to speed AI workloads. Equipped with 640 Tensor Cores, V100 delivers 120 teraflops of deep learning performance, equivalent to the performance of 100 CPUs.
  • New GPU architecture with over 21 billion transistors. It pairs CUDA cores and Tensor Cores within a unified architecture, providing the performance of an AI supercomputer in a single GPU.
  • NVLink provides the next generation of high-speed interconnect linking GPUs, and GPUs to CPUs, with up to 2x the throughput of the prior generation NVLink.
  • 900 GB/sec HBM2 DRAM, developed in collaboration with Samsung, achieves 50 percent more memory bandwidth than previous generation GPUs, essential to support the extraordinary computing throughput of Volta.
  • Volta-optimized software, including CUDA, cuDNN and TensorRT software, which leading frameworks and applications can easily tap into to accelerate AI and research.
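
To put those headline figures in proportion, here is a rough back-of-the-envelope sketch in Python. It uses only the numbers quoted in the list above; the variable names are ours, and the results are simple ratios of Nvidia's peak marketing figures, not measured or official per-unit specifications.

```python
# Figures quoted from Nvidia's announcement (peak, vendor-stated values).
TENSOR_CORES = 640              # Tensor Cores on the V100
DEEP_LEARNING_TFLOPS = 120.0    # stated deep learning performance
CPU_EQUIVALENT = 100            # "equivalent to the performance of 100 CPUs"
HBM2_BANDWIDTH_GB_S = 900.0     # stated HBM2 memory bandwidth

# Average share of the stated peak per Tensor Core, in gigaflops.
gflops_per_tensor_core = DEEP_LEARNING_TFLOPS * 1000 / TENSOR_CORES

# Implied deep learning throughput of one "CPU" in Nvidia's comparison.
tflops_per_cpu = DEEP_LEARNING_TFLOPS / CPU_EQUIVALENT

# Floating-point operations available per byte read from memory at peak.
# A high ratio is one way to see why memory bandwidth matters so much here.
flops_per_byte = (DEEP_LEARNING_TFLOPS * 1e12) / (HBM2_BANDWIDTH_GB_S * 1e9)

print(f"Per Tensor Core: {gflops_per_tensor_core:.1f} gigaflops")
print(f"Implied per CPU: {tflops_per_cpu:.1f} teraflops")
print(f"Compute per byte of bandwidth: {flops_per_byte:.0f} flops")
```

By these ratios, each Tensor Core accounts for about 187.5 gigaflops of the stated peak, and the chip can perform roughly 133 operations for every byte its memory delivers, which helps explain why Nvidia describes the faster HBM2 memory as essential.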

A number of organizations are planning to use Volta in their applications, including Amazon Web Services, Baidu, Facebook, Google, and Microsoft. As AI and machine learning are integrated more closely into the technology we use every day, solutions like Volta and the Tesla V100 are likely to be what powers that technology.