How TensorRT Accelerates AI on RTX PCs
From NVIDIA
Generative AI applications benefit from running locally on PCs with NVIDIA RTX GPUs, whose Tensor Cores deliver lower latency and keep data on the device. Stable Video Diffusion is now optimized for NVIDIA TensorRT, offering enhanced performance on Windows PCs; the UL Procyon AI Image Generation benchmark shows a 50% speedup with TensorRT.
TensorRT accelerates popular generative AI models such as Stable Diffusion and SDXL, delivering up to 2x the performance of other frameworks. The Stable Video Diffusion 1.1 Image-to-Video model sees a 40% speedup, and ControlNets run about 40% faster with TensorRT. The optimized models are available for download on Hugging Face.
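The speedup comes from compiling a model into a TensorRT engine tuned for the local GPU. Below is a minimal sketch of that build step using the TensorRT Python API; the unet.onnx input file, the FP16 flag, and the assumption of static input shapes are illustrative choices, not the exact workflow NVIDIA ships with these checkpoints.

```python
import tensorrt as trt

TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

# Parse an ONNX export of the model into a TensorRT network definition.
builder = trt.Builder(TRT_LOGGER)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH)
)
parser = trt.OnnxParser(network, TRT_LOGGER)

# Hypothetical ONNX export of a diffusion UNet with static input shapes;
# dynamic shapes would additionally require an optimization profile.
with open("unet.onnx", "rb") as f:
    if not parser.parse(f.read()):
        for i in range(parser.num_errors):
            print(parser.get_error(i))
        raise RuntimeError("failed to parse ONNX model")

# Build an engine optimized for this GPU; FP16 runs on the RTX Tensor Cores.
config = builder.create_builder_config()
config.set_flag(trt.BuilderFlag.FP16)
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 4 << 30)  # 4 GiB scratch

engine_bytes = builder.build_serialized_network(network, config)
with open("unet.plan", "wb") as f:
    f.write(engine_bytes)  # reusable, GPU-specific engine file
```

The resulting .plan file is specific to the GPU it was built on, which is why TensorRT-optimized checkpoints are typically distributed as ONNX plus a build step rather than as prebuilt engines.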
TensorRT optimizations extend to ControlNets, which let users guide generative outputs by adding extra conditions for finer customization. This extension boosts performance by up to 2x while giving greater control over the final image. A ControlNet can be conditioned on a depth map, an edge map, a normal map, or keypoint detections.
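To make the "extra conditions" concrete, here is a short sketch using the Hugging Face diffusers library with a Canny edge ControlNet. It runs on the stock PyTorch path rather than the TensorRT-accelerated extension, and the reference image URL and prompt are placeholders.

```python
import cv2
import numpy as np
import torch
from PIL import Image
from diffusers import ControlNetModel, StableDiffusionControlNetPipeline, UniPCMultistepScheduler
from diffusers.utils import load_image

# Turn a reference photo into a Canny edge map; the edge map is the extra condition.
image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/input_image_vermeer.png"
)
edges = cv2.Canny(np.array(image), 100, 200)
edge_map = Image.fromarray(np.stack([edges] * 3, axis=-1))

# Load a Canny-conditioned ControlNet alongside a Stable Diffusion pipeline.
controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = UniPCMultistepScheduler.from_config(pipe.scheduler.config)

# The edge map constrains composition while the prompt controls style and content.
result = pipe("a stained glass portrait", image=edge_map, num_inference_steps=20).images[0]
result.save("controlnet_out.png")
```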
DaVinci Resolve's AI tools run over 50% faster with NVIDIA TensorRT integration, while Topaz Labs' Photo AI and Video AI apps see up to a 60% performance increase on RTX GPUs. Tensor Cores combined with TensorRT software offer unmatched generative AI performance locally, unlocking benefits such as lower latency, cost savings, always-on availability, and data privacy.
NVIDIA TensorRT-LLM optimizes LLM inference and supports popular models such as Phi-2, Llama 2, Gemma, Mistral, and Code Llama. The open-source library includes connectors to popular application frameworks like LlamaIndex and LangChain, making it easy for developers to get the best LLM performance on RTX GPUs. Subscribe to the AI Decoded newsletter for weekly updates.
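As a sketch of what calling the library directly looks like, the snippet below uses TensorRT-LLM's high-level LLM API; the model ID and prompt are placeholders, and the LlamaIndex and LangChain connectors wrap a similar call behind their own interfaces.

```python
from tensorrt_llm import LLM, SamplingParams

# The first run builds (and caches) a TensorRT engine for the checkpoint, then serves it.
llm = LLM(model="meta-llama/Llama-2-7b-chat-hf")  # placeholder Hugging Face model ID

prompts = ["Explain in one sentence what Tensor Cores do."]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

# Batched generation; each result carries the generated completions for one prompt.
for output in llm.generate(prompts, sampling_params):
    print(output.outputs[0].text)
```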
Read more at NVIDIA: How TensorRT Accelerates AI on RTX PCs