NVIDIA Blackwell Delivers Next-Level MLPerf Training Performance
From NVIDIA: 2024-11-13 11:00:01
Generative AI applications spanning text, code, protein chains, and 3D graphics require data-center-scale accelerated computing. In the MLPerf Training 4.1 benchmarks, the NVIDIA Blackwell platform delivered up to 2.2x more performance per GPU than the Hopper platform on large language model (LLM) benchmarks. The NVIDIA Hopper platform also continued to hold at-scale records, including a submission that used 11,616 GPUs on the GPT-3 175B benchmark.
The Blackwell architecture advances generative AI training with new kernels optimized for its Tensor Cores. Higher per-GPU compute throughput and faster, higher-bandwidth memory allow the GPT-3 175B benchmark to run on fewer GPUs while still achieving excellent per-GPU performance. Blackwell's efficiency is highlighted by its running the GPT-3 LLM benchmark with just 64 GPUs, compared to the 256 GPUs needed with Hopper.
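As a rough illustration of how per-GPU comparisons like the 2.2x figure are derived, the sketch below normalizes MLPerf time-to-train results by GPU count. The times and GPU counts here are hypothetical placeholders chosen to make the arithmetic concrete, not actual MLPerf 4.1 submission data.

```python
def per_gpu_speedup(time_new_min: float, gpus_new: int,
                    time_ref_min: float, gpus_ref: int) -> float:
    """Relative per-GPU performance of a new submission vs. a reference.

    MLPerf Training measures time-to-train to a target quality, so
    per-GPU throughput is proportional to 1 / (time * GPU count).
    """
    return (time_ref_min * gpus_ref) / (time_new_min * gpus_new)

# Hypothetical numbers chosen only to show how a 2.2x per-GPU figure
# could arise; these are not real MLPerf submission times.
print(per_gpu_speedup(time_new_min=120.0, gpus_new=64,
                      time_ref_min=66.0, gpus_ref=256))  # -> 2.2
```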
NVIDIA continues to optimize its platforms relentlessly, delivering ongoing performance and feature improvements in both training and inference. Hopper GPUs have achieved a 1.3x improvement in per-GPU GPT-3 175B training performance since the benchmark was introduced. The large-scale result on the GPT-3 175B benchmark, using 11,616 Hopper GPUs connected with high-bandwidth interconnect, also showcased significant performance gains.
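The value of running at a scale like 11,616 GPUs is typically judged by scaling efficiency: how much of the ideal linear speedup a larger run retains. A minimal sketch of that calculation follows; the baseline and scaled figures are hypothetical, not published MLPerf numbers.

```python
def scaling_efficiency(base_time_min: float, base_gpus: int,
                       scaled_time_min: float, scaled_gpus: int) -> float:
    """Fraction of ideal linear speedup retained when scaling out.

    Ideal scaling: time-to-train shrinks in proportion to GPU count.
    1.0 means perfectly linear; real runs land below that because of
    communication and synchronization overhead.
    """
    ideal_time = base_time_min * base_gpus / scaled_gpus
    return ideal_time / scaled_time_min

# Hypothetical example: scaling from 512 to 11,616 GPUs at ~90%
# efficiency cuts time-to-train by ~20.4x versus the ideal 22.7x.
print(scaling_efficiency(base_time_min=480.0, base_gpus=512,
                         scaled_time_min=23.5, scaled_gpus=11_616))  # -> ~0.90
```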
NVIDIA partners, including ASUSTek, Azure, Cisco, Dell, and others, also submitted impressive results to MLPerf. NVIDIA emphasizes the importance of industry-standard benchmarks in AI computing, as they provide vital data for organizations making platform investment decisions. Ongoing optimization of NVIDIA's accelerated computing platforms continues to drive performance improvements in MLPerf test results, benefiting partners and customers alike.
Read more at NVIDIA: NVIDIA Blackwell Delivers Next-Level MLPerf Training Performance