Tuesday, September 27

NVIDIA Clean Sweeps MLPerf AI Benchmarks with Hopper H100 GPU, Up to 4.5X Performance Over A100 Ampere

NVIDIA’s Hopper H100 GPU debuted on the MLPerf AI benchmark list and broke all previous records set by the Ampere A100. As Hopper Tensor Core GPUs pave the way for the next big AI revolution, Ampere A100 GPUs continue to deliver strong performance across the mainstream AI application suite, while Jetson AGX Orin leads in edge computing.

NVIDIA’s AI revolution continues with the Hopper H100 Tensor Core GPU that smashes all MLPerf benchmarks, delivering up to 4.5x more performance than the last generation

Press release: In their debut on the industry-standard MLPerf AI benchmarks, NVIDIA H100 Tensor Core GPUs set world records for inference across all workloads, delivering up to 4.5x better performance than GPUs of the previous generation. The results demonstrate that Hopper is the premium choice for users who demand top performance on advanced AI models.

Offline scenario for data center and edge (single GPU)

Additionally, NVIDIA A100 Tensor Core GPUs and the NVIDIA Jetson AGX Orin module for AI-powered robotics continued to deliver leading overall inference performance across all MLPerf tests: image and speech recognition, natural language processing, and recommender systems.

The H100, aka Hopper, raised the per-accelerator performance bar on all six neural networks in this round. It demonstrated throughput and speed leadership in both the server and offline scenarios. The NVIDIA Hopper architecture delivered up to 4.5x more performance than NVIDIA Ampere architecture GPUs, which continue to provide overall leadership in MLPerf results.
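The server and offline scenarios mentioned above measure different things: offline assumes all queries are available up front and reports raw throughput, while server sends queries one at a time and reports latency. A minimal sketch of the distinction, using a placeholder compute function rather than a real network (MLPerf runs actual models such as ResNet-50 or BERT on the accelerator under test):

```python
import time

# Placeholder for a model's forward pass; purely illustrative.
def infer(batch):
    return [x * 2 for x in batch]

def offline_throughput(samples, batch_size):
    """Offline scenario: all samples available up front; report samples/sec."""
    start = time.perf_counter()
    for i in range(0, len(samples), batch_size):
        infer(samples[i:i + batch_size])
    elapsed = time.perf_counter() - start
    return len(samples) / elapsed

def server_latency(samples):
    """Server scenario: queries arrive one at a time; report tail latency."""
    latencies = []
    for s in samples:
        start = time.perf_counter()
        infer([s])
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    return latencies[int(0.99 * len(latencies))]  # 99th-percentile latency

samples = list(range(10_000))
print(f"offline: {offline_throughput(samples, 64):.0f} samples/sec")
print(f"server p99: {server_latency(samples) * 1e6:.1f} us")
```

The same accelerator can rank differently on the two metrics, which is why MLPerf reports them separately.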

Thanks in part to its Transformer Engine, Hopper excelled on the popular BERT model for natural language processing, one of the largest and most performance-intensive models in the MLPerf suite. These inference benchmarks mark the first public demonstration of the H100 GPUs, which will be available later this year. H100 GPUs will participate in future MLPerf rounds for training.

A100 GPUs show leadership

NVIDIA A100 GPUs, available today from leading cloud service providers and system manufacturers, continued to show overall mainstream performance leadership on AI inference in the latest tests. A100 GPUs won more tests than any other submission in the data center and edge computing categories and scenarios. In June, the A100 also secured overall leadership in the MLPerf training benchmarks, demonstrating its capabilities across the entire AI workflow.

Image: the NVIDIA GA100 die.

Since their debut in July 2020 on MLPerf, A100 GPUs have seen a 6x increase in performance, thanks to continuous improvements in NVIDIA AI software. NVIDIA AI is the only platform to run all MLPerf inference workloads and scenarios in data centers and edge computing.

Users need versatile performance

The ability of NVIDIA GPUs to deliver cutting-edge performance across all major AI models makes users the real winners. Their real-world applications typically use many different types of neural networks.

For example, an AI application might need to understand a user’s voice request, classify an image, make a recommendation, and then provide a response as a spoken message in a human-sounding voice. Each stage requires a different type of AI model.
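The multi-model flow described above can be sketched as a simple chain of stages. Every function below is a hypothetical stub standing in for a different neural network (speech recognition, vision, recommendation, text-to-speech); the names, signatures, and return values are illustrative, not a real API:

```python
# Hypothetical stand-ins for four different neural networks.
def transcribe_speech(audio):
    """Speech-recognition model: audio -> text query."""
    return "show me red sneakers"

def classify_image(image):
    """Image-classification model: image -> category label."""
    return "sneaker"

def recommend(query, category):
    """Recommender model: query + category -> candidate items."""
    return ["model A", "model B"]

def synthesize_speech(text):
    """Text-to-speech model: text -> audio response."""
    return f"<audio: {text}>"

def assistant(audio, image):
    """Chains the four stages, as in the example in the text."""
    query = transcribe_speech(audio)
    category = classify_image(image)
    items = recommend(query, category)
    return synthesize_speech("I recommend " + " and ".join(items))

print(assistant(b"\x00", b"\x00"))
```

Because each stage exercises a different network architecture, end-to-end performance depends on an accelerator doing well across all of them, not just one.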

MLPerf benchmarks cover these popular AI workloads and scenarios as well as others – computer vision, natural language processing, recommender systems, speech recognition, and more. Testing ensures that users will get performance that is both dependable and flexible to deploy.

Users rely on MLPerf results to make informed purchasing decisions because the tests are transparent and objective. The benchmarks have the backing of a broad group that includes Amazon, Arm, Baidu, Google, Harvard, Intel, Meta, Microsoft, Stanford and the University of Toronto.

Orin leads at the edge

In edge computing, NVIDIA Orin ran every MLPerf benchmark, winning more tests than any other low-power system-on-chip. And it showed up to a 50% energy-efficiency gain over its debut on MLPerf in April. In the previous round, Orin ran up to 5x faster than the previous-generation Jetson AGX Xavier module while delivering 2x the energy efficiency on average.
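Energy efficiency in these comparisons means inference throughput per watt. A minimal sketch of the arithmetic, using made-up numbers rather than measured MLPerf figures:

```python
def perf_per_watt(samples_per_sec, watts):
    """Energy efficiency = throughput divided by power draw."""
    return samples_per_sec / watts

# Hypothetical figures for illustration only (not measured results):
baseline = perf_per_watt(1_000, 30)  # older module
newer    = perf_per_watt(5_000, 50)  # ~5x throughput at higher power
print(f"efficiency gain: {newer / baseline:.1f}x")  # → 3.0x
```

This is why a chip can post a large efficiency gain even when its raw throughput advantage comes at a higher absolute power draw.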

Orin integrates an NVIDIA Ampere architecture GPU and a cluster of powerful Arm processor cores into a single chip. It is available today in the NVIDIA Jetson AGX Orin development kit and production modules for robotics and autonomous systems and supports the entire NVIDIA AI software stack, including platforms for autonomous vehicles (NVIDIA Hyperion), medical devices (Clara Holoscan) and robotics (Isaac).