NVIDIA Triton Accelerates Inference on Oracle Cloud

From NVIDIA:

Thomas Park, an avid cyclist and software architect, used NVIDIA Triton Inference Server to design an AI platform for Oracle’s Vision AI service. This reduced OCI’s total cost of ownership by 10%, increased prediction throughput by up to 76%, and cut latency by up to 51% for OCI Vision and Document Understanding Service models.

OCI Vision AI handles object detection and image classification, aiding in bridge toll collection and automating invoice recognition. Triton’s success led to its adoption across other OCI services, and it was subsequently integrated into OCI’s Data Science service for customer convenience.

The NVIDIA AI Enterprise platform, which includes Triton, is available on the OCI Marketplace, making it even easier for users to adopt the fast, flexible inference server. OCI’s Data Science service serves tens of thousands of customers across industries who build and use AI models of nearly every shape and size.
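To make the "inference server" concrete: Triton exposes deployed models over the standard KServe v2 REST protocol, so a client talks to it with plain HTTP and JSON. The sketch below builds such a request body for a hypothetical image classifier; the model name, input/output names, and tensor shape are illustrative assumptions, not details from the article or from any specific OCI service.

```python
import json

# Triton serves models at POST /v2/models/<model_name>/infer (KServe v2
# REST protocol). This helper builds the JSON body for a hypothetical
# classifier that takes one flattened 224x224 RGB image as FP32 input.
# The names "input_tensor" and "probabilities" are assumptions for
# illustration only.
def build_infer_request(pixels):
    return {
        "inputs": [
            {
                "name": "input_tensor",      # assumed input tensor name
                "shape": [1, 3, 224, 224],   # NCHW batch of one image
                "datatype": "FP32",
                "data": pixels,              # row-major flattened values
            }
        ],
        "outputs": [{"name": "probabilities"}],  # assumed output name
    }

# A real client would POST json.dumps(payload) to the server, e.g.:
#   http://<triton-host>:8000/v2/models/<model_name>/infer
payload = build_infer_request([0.0] * (3 * 224 * 224))
body = json.dumps(payload)
```

In production, clients would more often use the official `tritonclient` Python package, which wraps this protocol over HTTP or gRPC; the raw JSON form above just shows what travels on the wire.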

OCI’s Data Science service plans to evaluate NVIDIA TensorRT-LLM software to accelerate inference on large language models. With the latest NVIDIA H100, H200, and L40S GPUs and Grace Hopper Superchips now deployed, these acceleration efforts are just beginning.



Read more: NVIDIA Triton Accelerates Inference on Oracle Cloud