The top 10 most intelligent open-source models all use a mixture-of-experts (MoE) architecture, with models like Kimi K2 Thinking and Mistral Large 3 running 10x faster on NVIDIA GB200 NVL72 systems. The architecture mimics the efficiency of the human brain: a router activates only a small subset of specialized experts for each token, so token generation speeds up without a matching rise in compute demand.
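For readers who want the mechanics, the sketch below shows the core idea in a toy PyTorch layer: a learned router scores the experts for each token and only the top-k experts run. Every name and size here is illustrative, not taken from any model mentioned above.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy mixture-of-experts layer: a router picks the top-k experts
    per token, so only a fraction of the parameters run per token."""

    def __init__(self, d_model=64, d_ff=256, n_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                          nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        gate = F.softmax(self.router(x), dim=-1)           # expert scores per token
        weights, idx = gate.topk(self.k, dim=-1)           # keep only top-k experts
        weights = weights / weights.sum(-1, keepdim=True)  # renormalize their weights
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(self.k):
                mask = idx[:, slot] == e                   # tokens routed to expert e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

tokens = torch.randn(16, 64)
print(MoELayer()(tokens).shape)  # torch.Size([16, 64])
```

Because only k of the n experts execute per token, total parameter count (and thus model capacity) can grow while per-token compute stays roughly flat, which is the efficiency described above.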
MoE has become the standard architecture for frontier models, delivering higher intelligence and adaptability without a corresponding rise in computational cost. Over 60% of the open-source AI models released this year use an MoE architecture, and measured model intelligence has increased nearly 70x since early 2023. MoE is now a primary driver of advancing AI capability.
Scaling MoE models brings its own challenges, chiefly memory capacity and the communication between experts, but extreme hardware-software codesign, exemplified by the NVIDIA GB200 NVL72 system, addresses these bottlenecks. The rack-scale system lets MoE models distribute experts across many GPUs, easing memory limits and accelerating expert-to-expert communication. Full-stack optimizations further improve performance and efficiency for MoE models in production.
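The sketch below illustrates that expert-parallel layout in miniature, with "devices" simulated as plain Python dicts: experts are sharded round-robin across devices, and an all-to-all style exchange moves each token to the device hosting its chosen expert. Real deployments do this exchange over NVLink between GPUs; the shard layout, sizes, and routing here are assumptions for illustration only.

```python
import numpy as np

n_experts, n_devices, d = 8, 4, 16
rng = np.random.default_rng(0)

# Shard experts round-robin: device i hosts experts i, i + n_devices, ...
# (In a real system each device would hold only its own experts' weights.)
expert_weights = {e: rng.standard_normal((d, d)) for e in range(n_experts)}
device_of = {e: e % n_devices for e in range(n_experts)}

tokens = rng.standard_normal((32, d))
routed = rng.integers(0, n_experts, size=32)     # expert chosen per token

# "All-to-all" exchange: send each token to the device hosting its expert.
inbox = {dev: [] for dev in range(n_devices)}
for t, e in enumerate(routed):
    inbox[device_of[e]].append((t, e))

# Each device runs only its local experts over the tokens it received.
out = np.zeros_like(tokens)
for dev, items in inbox.items():
    for t, e in items:
        out[t] = tokens[t] @ expert_weights[e]   # local expert compute

print(out.shape)  # (32, 16)
```

The exchange step is the bottleneck the prose refers to: the faster tokens can move between devices, the less expert parallelism costs relative to running everything on one GPU.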
NVIDIA GB200 NVL72 delivers a 10x leap in performance per watt for MoE models, transforming the economics of AI at scale in power- and cost-constrained data centers. Leading frontier models such as Kimi K2 Thinking and Mistral Large 3 see a 10x generational performance gain on the platform, and leading AI service providers are deploying GB200 NVL72 to unlock the full potential of MoE models.
The future of AI lies in specialized multimodal models and agentic systems that mirror the MoE pattern, routing each task to the most relevant experts. A shared pool of experts, accessible to many applications and agents at once, lets efficiency and scale coexist. The NVIDIA GB200 NVL72 system paves the way for this future, unlocking the potential of MoE architecture for diverse applications and users.
Read more at NVIDIA: Mixture of Experts Powers the Most Intelligent Frontier Models
