Google’s Gemma Optimized Across All NVIDIA AI Platforms
From NVIDIA:
NVIDIA, in collaboration with Google, has optimized its AI platforms for Gemma, Google's family of lightweight open language models, to cut costs and speed up innovation on domain-specific use cases. The NVIDIA TensorRT-LLM library and GeForce RTX GPUs let developers run Gemma locally, with more than 100 million RTX GPUs available in high-performance AI PCs worldwide. Gemma can also run on NVIDIA GPUs in the cloud, including Google Cloud A3 instances based on the H100 Tensor Core GPU and the upcoming H200 Tensor Core GPUs.
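For a sense of what running Gemma through TensorRT-LLM looks like in practice, here is a minimal sketch using the library's high-level Python API. It assumes a recent TensorRT-LLM release that ships the LLM/SamplingParams interface (exact signatures vary by version), a supported NVIDIA GPU, and access to the google/gemma-2b checkpoint on Hugging Face; this is an illustration, not NVIDIA's reference code.

```python
# Minimal sketch, assuming a recent TensorRT-LLM release with the
# high-level LLM API and a supported NVIDIA GPU. The model ID
# "google/gemma-2b" is the Hugging Face checkpoint name.
from tensorrt_llm import LLM, SamplingParams

def main():
    # Builds (or loads a cached) TensorRT engine for Gemma 2B.
    llm = LLM(model="google/gemma-2b")

    params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)
    prompts = ["Explain what TensorRT-LLM does in one sentence."]

    for output in llm.generate(prompts, params):
        print(output.outputs[0].text)

if __name__ == "__main__":
    main()
```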
Enterprise developers can take advantage of NVIDIA's rich ecosystem of tools to fine-tune Gemma for their production applications. TensorRT-LLM is also speeding up inference for Gemma, with model checkpoints and optimized versions of the model available to developers. Gemma 2B and Gemma 7B can be tried directly from any web browser on the NVIDIA AI Playground.
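As an illustration of one common fine-tuning route (not the only one in NVIDIA's ecosystem; NeMo is another), the sketch below attaches LoRA adapters to Gemma 2B using the Hugging Face transformers and peft libraries, so only a small fraction of weights are trained. The checkpoint name, toy dataset, and output path are placeholders.

```python
# Minimal LoRA fine-tuning sketch, assuming the Hugging Face transformers
# and peft libraries and a GPU with enough memory for Gemma 2B.
# The one-example "dataset" is a stand-in for real domain-specific data.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "google/gemma-2b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# Attach low-rank adapters to the attention projections; the base
# model's weights stay frozen.
lora = LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-4)
examples = [
    "Q: What is TensorRT-LLM? "
    "A: An open-source library for optimizing LLM inference on NVIDIA GPUs."
]

model.train()
for text in examples:
    batch = tokenizer(text, return_tensors="pt").to(model.device)
    # Standard causal-LM objective: the inputs shifted by one are the labels.
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

model.save_pretrained("gemma-2b-lora-adapter")  # saves only the adapter weights
```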
NVIDIA also introduced Chat with RTX, a tech demo that uses retrieval-augmented generation (RAG) and TensorRT-LLM software to give users generative AI capabilities on their local, RTX-powered Windows PCs. Users can personalize a chatbot with their own data by connecting local files to a large language model. Because all processing happens on the device, results come back quickly and user data never leaves the PC, with no third-party sharing and no internet connection required.
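The pattern behind Chat with RTX (retrieve passages from local files, then generate an answer locally) can be sketched in a few lines. The example below is not Chat with RTX's actual implementation: it substitutes scikit-learn TF-IDF retrieval for a real embedding index, the my_notes folder is hypothetical, and the final LLM call is left as a comment (the TensorRT-LLM sketch above could fill that role).

```python
# Minimal local RAG sketch illustrating the retrieve-then-generate pattern.
# NOT Chat with RTX's implementation: TF-IDF stands in for a real
# embedding index, and the generation step is left as a placeholder.
from pathlib import Path
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def load_documents(folder: str) -> list[str]:
    # Read the user's local text files; nothing leaves the machine.
    return [p.read_text(encoding="utf-8") for p in Path(folder).glob("*.txt")]

def retrieve(query: str, docs: list[str], k: int = 3) -> list[str]:
    # Rank documents by cosine similarity to the query in TF-IDF space.
    vectorizer = TfidfVectorizer()
    doc_vectors = vectorizer.fit_transform(docs)
    query_vector = vectorizer.transform([query])
    scores = cosine_similarity(query_vector, doc_vectors)[0]
    top = scores.argsort()[::-1][:k]
    return [docs[i] for i in top]

def build_prompt(query: str, docs: list[str]) -> str:
    # Splice the retrieved passages into the prompt as grounding context.
    context = "\n---\n".join(retrieve(query, docs))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}\nAnswer:"

docs = load_documents("my_notes")  # hypothetical folder of .txt files
prompt = build_prompt("What did I write about GPUs?", docs)
# `prompt` would now go to a locally running model (e.g. Gemma via the
# TensorRT-LLM sketch above), so user data never leaves the device.
print(prompt)
```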
Read more: Google’s Gemma Optimized Across All NVIDIA AI Platforms