NVIDIA Ethernet Networking Accelerates World’s Largest AI Supercomputer, Built by xAI

From Nvidia: 2024-10-28 11:00:00

xAI’s Colossus supercomputer, powered by 100,000 NVIDIA Hopper GPUs, uses the NVIDIA Spectrum-X Ethernet networking platform to train large language models such as Grok, which power the chatbots offered to X Premium subscribers. The facility was built in just 122 days, and its network fabric has shown zero application latency degradation.

Colossus, billed as the world’s largest AI supercomputer, is being expanded by xAI to 200,000 NVIDIA Hopper GPUs. According to NVIDIA, the system’s performance while training the Grok model sets a new standard for AI processing, analysis, and execution, and the Spectrum-X platform lets AI solutions be developed and deployed faster, with greater scalability and efficiency.

Elon Musk praises xAI, NVIDIA, and their partners for building Colossus, which he calls the most powerful training system. Pairing NVIDIA Hopper GPUs with Spectrum-X networking lets xAI train AI models at unprecedented scale on a fabric built on standard Ethernet.

The Spectrum-X platform, which combines the Spectrum SN5600 Ethernet switch with NVIDIA BlueField-3 SuperNICs, offers port speeds of up to 800Gb/s along with advanced features such as adaptive routing and congestion control. Together these deliver the high effective bandwidth and low latency that multi-tenant generative AI clouds and enterprise environments need.

Read more at Nvidia: NVIDIA Ethernet Networking Accelerates World’s Largest AI Supercomputer, Built by xAI