Leading inference providers are adopting the NVIDIA Blackwell platform to reduce cost per token by up to 10x. The new NVIDIA Blackwell Ultra platform builds on that momentum for agentic AI, where AI agents and coding assistants are driving a significant increase in AI queries.
The NVIDIA Blackwell Ultra platform delivers up to 50x higher throughput per megawatt and up to 35x lower cost per token than the NVIDIA Hopper platform. Extreme codesign across chips, system architecture, and software accelerates performance across AI workloads while reducing costs at scale.
The NVIDIA GB300 NVL72 system has been shown to deliver more than 10x the tokens per watt of the NVIDIA Hopper platform at one-tenth the cost per token. Continuous optimizations from NVIDIA’s software teams have significantly boosted Blackwell NVL72 throughput for mixture-of-experts (MoE) inference, with up to 5x better performance on low-latency workloads.
The GB300 NVL72 system with the Blackwell Ultra GPU raises throughput per megawatt by up to 50x over the Hopper platform. The result is superior economics: up to 35x lower cost per million tokens than Hopper, particularly for agentic coding and interactive assistant workloads.
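The link between throughput per megawatt and cost per million tokens can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name, the electricity price, and the baseline throughput are hypothetical placeholders, not figures from NVIDIA or SemiAnalysis. It models energy cost alone and ignores the hardware, networking, and operating costs that a real cost-per-token estimate would include.

```python
def energy_cost_per_million_tokens(tokens_per_sec_per_mw: float,
                                   usd_per_mwh: float) -> float:
    """Energy-only cost in USD to generate one million tokens (hypothetical helper)."""
    if tokens_per_sec_per_mw <= 0:
        raise ValueError("throughput must be positive")
    # One megawatt sustained for one hour consumes one MWh.
    tokens_per_mwh = tokens_per_sec_per_mw * 3600
    return usd_per_mwh / tokens_per_mwh * 1_000_000


# Hypothetical placeholder values, chosen only to show how the ratio scales.
BASELINE_TOKENS_PER_SEC_PER_MW = 100_000.0
USD_PER_MWH = 80.0

baseline = energy_cost_per_million_tokens(BASELINE_TOKENS_PER_SEC_PER_MW, USD_PER_MWH)
improved = energy_cost_per_million_tokens(50 * BASELINE_TOKENS_PER_SEC_PER_MW, USD_PER_MWH)

print(f"baseline: ${baseline:.4f} per million tokens (energy only)")
print(f"50x throughput per MW: ${improved:.4f} per million tokens (energy only)")
print(f"energy cost ratio: {baseline / improved:.0f}x")
```

Under these assumptions, a 50x gain in throughput per megawatt cuts the energy share of cost per token by the same factor. Total cost per token also depends on hardware and operating costs, which is one plausible reason the reported cost reduction (up to 35x) differs from the throughput-per-megawatt gain (up to 50x).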
For long-context scenarios such as AI coding assistants reasoning across codebases, the GB300 NVL72 system delivers up to 1.5x lower cost per token than the GB200 NVL72 system. With 1.5x higher NVFP4 compute performance and 2x faster attention processing, the Blackwell Ultra GPU lets agents work efficiently across entire codebases.
Leading cloud providers such as Microsoft, CoreWeave, and Oracle Cloud Infrastructure (OCI) are deploying the NVIDIA GB300 NVL72 system for low-latency and long-context use cases such as agentic coding and coding assistants. These deployments enable a new class of applications that can reason across massive codebases in real time while reducing token costs and improving inference efficiency.
Read more at NVIDIA: New SemiAnalysis InferenceX Data Shows NVIDIA Blackwell Ultra Delivers up to 50x Better Performance and 35x Lower Costs for Agentic AI
