NVIDIA set a record of 410 trillion traversed edges per second (TEPS) on the Graph500 breadth-first search (BFS) list, using an accelerated computing cluster with 8,192 H100 GPUs. That is more than double the performance of comparable solutions, and fast enough to search through every friend relationship on Earth in just three milliseconds.
The winning run from NVIDIA used just over 1,000 nodes, delivering 3x better performance per dollar compared to other solutions on the Graph500 list. By tapping into the power of its full-stack compute, networking, and software technologies, NVIDIA’s system demonstrated both speed and efficiency at scale.
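To put those figures in proportion, here is a small back-of-the-envelope sketch in Python. The throughput, GPU count, and search time are the numbers quoted above. The value of 8 GPUs per node is an assumption made for illustration, not something the article states, although it is consistent with "just over 1,000 nodes".

```python
# Figures quoted in the article.
TEPS = 410e12            # traversed edges per second
GPUS = 8192              # H100 GPUs used in the run
SEARCH_TIME_S = 3e-3     # "three milliseconds"

# Assumption (not from the article): each node hosts 8 GPUs.
GPUS_PER_NODE = 8

# Edges the system can traverse in one three-millisecond search.
edges_in_search = TEPS * SEARCH_TIME_S

# Average throughput contributed by each GPU.
teps_per_gpu = TEPS / GPUS

# Node count implied by the assumed GPUs-per-node figure.
nodes = GPUS // GPUS_PER_NODE

print(f"edges traversed in 3 ms: {edges_in_search:.3g}")
print(f"TEPS per GPU:            {teps_per_gpu:.3g}")
print(f"nodes at {GPUS_PER_NODE} GPUs each:   {nodes}")
```

Under these numbers, a three-millisecond search covers about 1.23 trillion edges, and each GPU sustains roughly 50 billion traversed edges per second on average.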
Graphs are the foundation of modern technology, capturing relationships between pieces of information in massive webs of data. The Graph500 BFS benchmark measures a system’s ability to navigate irregular graphs at scale, a workload that rewards strong interconnects, memory bandwidth, and software.
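As a rough illustration of what the benchmark measures, the sketch below runs a level-synchronous BFS on a tiny graph and reports traversed edges per second. It is a toy, single-process example: the function names (`build_adjacency`, `bfs_parents`, `traversed_edges`) and the sample edge list are hypothetical, and it is neither the Graph500 reference code nor NVIDIA's implementation.

```python
import time
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency list from (u, v) pairs."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    return adj


def bfs_parents(adj, root):
    """Level-synchronous BFS; returns {vertex: parent} for reached vertices."""
    parent = {root: root}          # the root is its own parent
    frontier = [root]
    while frontier:
        next_frontier = []
        for u in frontier:
            for v in adj.get(u, ()):
                if v not in parent:        # first visit fixes the parent
                    parent[v] = u
                    next_frontier.append(v)
        frontier = next_frontier
    return parent


def traversed_edges(edges, parent):
    """Count input edges whose endpoints were both reached by the search."""
    return sum(1 for u, v in edges if u in parent and v in parent)


# Hypothetical sample graph: one 4-vertex component plus a separate edge.
edges = [(0, 1), (0, 2), (1, 3), (2, 3), (4, 5)]
adj = build_adjacency(edges)

start = time.perf_counter()
parent = bfs_parents(adj, root=0)
elapsed = time.perf_counter() - start

m = traversed_edges(edges, parent)
teps = m / elapsed if elapsed > 0 else float("inf")
print(f"reached {len(parent)} vertices, {m} edges, {teps:.3g} TEPS")
```

On a real system the difficulty lies less in this loop than in the irregular, data-dependent memory and network accesses it generates once the graph is spread across thousands of processors.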
To process massive graphs efficiently, NVIDIA engineered a full-stack, GPU-only solution that reimagines how data moves across the network. Using custom software frameworks and GPU-to-GPU active messaging, NVIDIA’s system bypasses the CPU to take full advantage of the parallelism and memory bandwidth of H100 GPUs.
NVIDIA’s breakthrough in graph processing has significant implications for high-performance computing (HPC) fields like fluid dynamics and weather forecasting. By validating a new approach to HPC at scale, the result shows developers a way to scale their largest applications efficiently using technologies such as NVSHMEM and InfiniBand GPUDirect Async (IBGDA).
Read more at NVIDIA: How NVIDIA H100 GPUs on CoreWeave’s AI Cloud Platform Delivered a Record-Breaking Graph500 Run
