Google Cloud upgrades AI infrastructure with Trillium TPU and Nvidia-powered VMs

Google Cloud unveils Trillium TPU and Nvidia VMs boosting AI speed and efficiency.

Google Cloud has introduced significant upgrades to its AI infrastructure with the Trillium TPU and Nvidia-powered VMs. The Trillium TPU, Google's sixth-generation unit, increases training performance fourfold and inference throughput threefold compared to the TPU v5e. Additionally, new A3 Ultra Virtual Machines powered by Nvidia's H200 GPUs are set to enhance performance and scalability for AI applications. These advancements empower industries to develop and deploy sophisticated AI models with improved efficiency.

Google Cloud has announced a major upgrade to its AI infrastructure, featuring the new Trillium Tensor Processing Unit (TPU) and Nvidia-powered Virtual Machines (VMs). The Trillium TPU, a sixth-generation unit, surpasses its predecessor, the TPU v5e, with a fourfold increase in training performance and a threefold boost in inference throughput, while also being 67% more energy-efficient.

The Trillium TPU is engineered with double the High Bandwidth Memory (HBM) capacity and double the Interchip Interconnect (ICI) bandwidth of the TPU v5e, making it well suited to training and serving large language models such as Gemma 2 and Llama, as well as compute-intensive generative workloads such as Stable Diffusion XL. Trillium's scalability also stands out: a single pod combines up to 256 chips, and hundreds of pods can be linked through Google's Jupiter data center network to form a building-scale supercomputer.
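For readers who want a concrete sense of how such a pod is requested, a minimal sketch using the documented Cloud TPU CLI pattern follows. The zone, resource name, and runtime version below are illustrative placeholders, not values from the announcement; check Google Cloud's current TPU documentation for the supported zones and software versions for Trillium (v6e).

```shell
# Hypothetical sketch: provision a full 256-chip Trillium (v6e) pod slice.
# Zone, TPU name, and --version are placeholder values for illustration.
gcloud compute tpus tpu-vm create trillium-demo \
  --zone=us-east5-b \
  --accelerator-type=v6e-256 \
  --version=v2-alpha-tpuv6e
```

Smaller slices of the same generation (for example, a single-chip `v6e-1` for development) follow the same command shape, with only the `--accelerator-type` value changing.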

Google Cloud's enhancements also include the upcoming A3 Ultra VMs featuring Nvidia's H200 GPUs, promising twice the networking bandwidth of their A3 Mega counterparts. The H200 GPUs offer nearly twice the memory capacity and 1.4 times the memory bandwidth of the previous generation, which Google says will roughly double performance on large language model inference tasks. Additionally, Google Cloud has launched the Hypercompute Cluster, which simplifies deployment and management of large-scale AI infrastructure by allowing customers to manage thousands of accelerators as a single unit.