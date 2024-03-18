



New AI infrastructure delivery and integration enables more open and accessible AI

GTC: Google Cloud and NVIDIA today announced a deepened partnership to give the machine learning (ML) community technology that accelerates its efforts to easily build, scale, and manage generative AI applications.

To continue bringing AI breakthroughs to its products and developers, Google announced its adoption of the new NVIDIA Grace Blackwell AI computing platform, as well as the NVIDIA DGX Cloud service on Google Cloud. Additionally, the NVIDIA H100-powered DGX™ Cloud platform is now generally available on Google Cloud.

Building on their recent collaboration to optimize the Gemma family of open models, Google has adopted NVIDIA NIM inference microservices to provide developers with an open, flexible platform to train and deploy using their preferred tools and frameworks. The companies also announced support for JAX on NVIDIA GPUs and Vertex AI instances powered by NVIDIA H100 and L4 Tensor Core GPUs.

“The strength of our long-term partnership with NVIDIA starts at the hardware level and extends across our entire portfolio, from cutting-edge GPU accelerators to our software ecosystem and managed Vertex AI platform,” said Thomas Kurian, CEO of Google Cloud. “Our team is committed to working with NVIDIA to provide ML developers with an accessible, open, and comprehensive AI platform.”

“Enterprises are looking for solutions that can take full advantage of generative AI in weeks and months, not years,” said Jensen Huang, Founder and CEO of NVIDIA. “With expanded infrastructure services and new integrations with NVIDIA's full-stack AI, Google Cloud continues to provide customers with an open, flexible platform that allows them to easily scale their generative AI applications.”

The new integrations between NVIDIA and Google Cloud build on the companies' long-standing commitment to providing the AI community with cutting-edge capabilities at every layer of the AI stack. Key elements of the expanded partnership include:

Adoption of NVIDIA Grace Blackwell: The new Grace Blackwell platform enables organizations to build and run real-time inference on trillion-parameter large language models (LLMs). Google has adopted this platform for various internal deployments and will be one of the first cloud providers to offer Blackwell-powered instances.

Grace Blackwell-powered DGX Cloud coming to Google Cloud: Google is bringing NVIDIA GB200 NVL72 systems, which combine 72 Blackwell GPUs and 36 Grace CPUs interconnected by fifth-generation NVLink®, to its highly scalable and performant cloud infrastructure. Designed for energy-efficient training and inference in the era of trillion-parameter LLMs, the NVIDIA GB200 NVL72 system will be available via DGX Cloud, an AI platform that provides a serverless experience for enterprise developers building and serving LLMs. DGX Cloud is now generally available on Google Cloud A3 VM instances powered by NVIDIA H100 Tensor Core GPUs.

JAX support on GPUs: Google Cloud and NVIDIA have teamed up to bring the benefits of JAX to NVIDIA GPUs, expanding access to large-scale LLM training among the broader ML community. JAX is a compiler-oriented, Python-native machine learning framework, and one of the easiest-to-use and highest-performing frameworks for LLM training. AI practitioners can now use JAX with NVIDIA H100 GPUs on Google Cloud through MaxText and Accelerated Processing Kit (XPK).
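To illustrate what "compiler-oriented" and "Python-native" mean in practice, the following is a minimal sketch of a JAX training step. It is not taken from MaxText or XPK; the toy linear model, the data, and the names `loss_fn` and `train_step` are assumptions made for illustration only. The function is written as ordinary Python, and `jax.jit` compiles it through XLA for whatever backend is present (an NVIDIA GPU if one is visible to JAX, otherwise the CPU).

```python
import jax
import jax.numpy as jnp


def loss_fn(w, x, y):
    # Mean squared error of a toy linear model (hypothetical example model).
    pred = x @ w
    return jnp.mean((pred - y) ** 2)


@jax.jit  # compile the whole step with XLA for the available backend
def train_step(w, x, y, lr):
    # value_and_grad returns the loss and its gradient with respect to w.
    loss, grads = jax.value_and_grad(loss_fn)(w, x, y)
    return w - lr * grads, loss


# Synthetic data, for illustration only.
key = jax.random.PRNGKey(0)
x = jax.random.normal(key, (32, 4))
true_w = jnp.array([1.0, -2.0, 0.5, 3.0])
y = x @ true_w

w = jnp.zeros(4)
losses = []
for _ in range(50):
    w, loss = train_step(w, x, y, 0.1)
    losses.append(float(loss))

# Lists the devices JAX sees; on a GPU VM this would include the GPUs.
print(jax.devices())
print(losses[0], losses[-1])
```

The same script runs unchanged on CPU or GPU, which is the portability the paragraph above refers to; large-scale LLM training adds sharding and multi-host orchestration on top of this basic pattern.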

NVIDIA NIM on Google Kubernetes Engine (GKE): NVIDIA NIM inference microservices, part of the NVIDIA AI Enterprise software platform, are integrated into GKE. Built on inference engines such as TensorRT-LLM™, NIM accelerates enterprise adoption of generative AI, supports a wide range of leading AI models, and ensures seamless and scalable AI inference.

NVIDIA NeMo support: Google Cloud has made it easier to deploy the NVIDIA NeMo™ framework across its platform through Google Kubernetes Engine (GKE) and Google Cloud HPC Toolkit. This lets developers automate and scale the training and serving of generative AI models and rapidly deploy turnkey environments through customizable blueprints that jump-start development. NVIDIA NeMo, part of NVIDIA AI Enterprise, is also available in Google Marketplace, giving customers another way to easily access NeMo and other frameworks to accelerate AI development.

Vertex AI and Dataflow expand support for NVIDIA GPUs: To advance data science and analytics, Vertex AI now supports Google Cloud A3 VMs powered by NVIDIA H100 GPUs and G2 VMs powered by NVIDIA L4 Tensor Core GPUs. This provides MLOps teams with scalable infrastructure and tools to manage and deploy AI applications with confidence. Dataflow has also expanded support for accelerated data processing on NVIDIA GPUs.

Google Cloud has long offered GPU VM instances that combine cutting-edge hardware from NVIDIA with leading innovations from Google. NVIDIA GPUs are a core component of the Google Cloud AI Hypercomputer, a supercomputing architecture that brings together performance-optimized hardware, open software, and flexible consumption models. This comprehensive partnership enables AI researchers, scientists, and developers to train, fine-tune, and serve the largest and most sophisticated AI models, now with even more of their favorite tools and frameworks available on Google Cloud.

“Runway's text-to-video platform is powered by AI Hypercomputer. At the base, A3 VMs powered by NVIDIA H100 GPUs gave our training a significant performance boost over A2 VMs, enabling large-scale training and inference for our Gen-2 model. Using GKE to orchestrate our training jobs enables us to scale to thousands of H100 GPUs in a single fabric to meet our customers' growing demand.”

“By moving to Google Cloud and leveraging the AI Hypercomputer architecture with NVIDIA T4 GPUs, G2 VMs with NVIDIA L4 GPUs, and Triton Inference Server, we significantly improved model inference performance while reducing our hosting costs by 15%, using new techniques enabled by the flexibility that Google Cloud provides.”

“Writer's platform is powered by the highly productive partnership between Google and NVIDIA. We are able to use NVIDIA GPUs optimally for training and inference. We leverage NVIDIA NeMo to build industrial-strength, high-quality models that generate 990,000 words per second and handle over 1 trillion API calls per month. All of this is made possible by the partnership between Google and NVIDIA. The benefits of their AI expertise are passed on to our enterprise customers, who can build meaningful AI workflows in days, not months or years.”

Learn more about the collaboration between Google Cloud and NVIDIA at GTC, the global AI conference (booth #808), March 18-21.

