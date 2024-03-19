AWS will offer Amazon EC2 instances based on NVIDIA Grace Blackwell GPUs and NVIDIA DGX Cloud to accelerate performance for building and running inference on multi-billion parameter LLMs

The integration of the AWS Nitro System, Elastic Fabric Adapter encryption, and AWS Key Management Service with Blackwell encryption gives customers end-to-end control of their training data and model weights, providing even stronger security for customer AI applications on AWS.

AWS and NVIDIA bring 20,736 GB200 superchips capable of processing 414 exaflops to Project Ceiba, a collaboration to build one of the fastest AI supercomputers, hosted exclusively on AWS with DGX Cloud, for NVIDIA's own AI R&D

Amazon SageMaker integration with NVIDIA NIM inference microservices helps customers further optimize the price performance of foundation models running on GPUs

AWS and NVIDIA Collaboration Accelerates AI Innovation in Healthcare and Life Sciences

GTC: Amazon Web Services (AWS), an Amazon.com company (NASDAQ: AMZN), and NVIDIA (NASDAQ: NVDA) today announced that the new NVIDIA Blackwell GPU platform, revealed by NVIDIA at GTC 2024, is coming to AWS. AWS will offer the NVIDIA GB200 Grace Blackwell Superchip and B100 Tensor Core GPUs, extending the companies' long-standing strategic collaboration to provide the most secure and advanced infrastructure, software, and services to help customers unlock new generative artificial intelligence (AI) capabilities.

NVIDIA and AWS continue to bring together the best of their technologies, including NVIDIA's latest multi-node systems featuring the NVIDIA Blackwell platform and next-generation AI software, the advanced security of the AWS Nitro System and AWS Key Management Service (AWS KMS), Elastic Fabric Adapter (EFA) networking, and Amazon Elastic Compute Cloud (Amazon EC2) UltraCluster large-scale clustering. Together, they provide the infrastructure and tools that enable customers to build and run real-time inference on multibillion-parameter large language models (LLMs) faster, at scale, and at lower cost than with previous-generation NVIDIA GPUs on Amazon EC2.

“The close collaboration between our two organizations dates back more than 13 years, when together we launched the world's first GPU cloud instance on AWS, and today we offer the broadest range of NVIDIA GPU solutions to customers,” said Adam Selipsky, CEO of AWS. “NVIDIA’s next-generation Grace Blackwell processor marks a significant advancement in generative AI and GPU computing. In combination with AWS's powerful Elastic Fabric Adapter networking, the large-scale clustering of Amazon EC2 UltraClusters, and the advanced virtualization and security capabilities of our unique Nitro System, we enable customers to build and run large language models with billions of parameters faster, at scale, and more securely than anywhere else. Together, we continue to innovate to make AWS the best place to run NVIDIA GPUs in the cloud.”

“AI is generating breakthroughs at an unprecedented pace, leading to new applications, business models and innovations across industries,” said Jensen Huang, founder and CEO of NVIDIA. “Our collaboration with AWS accelerates new generative AI capabilities and provides customers with unprecedented computing power to push the boundaries of what is possible.”

The latest innovations from AWS and NVIDIA accelerate the training of cutting-edge LLMs that can scale beyond 1 trillion parameters.

AWS will offer the NVIDIA Blackwell platform, including the GB200 NVL72, with 72 Blackwell GPUs and 36 Grace processors interconnected by fifth-generation NVIDIA NVLink™. When connected with Amazon's powerful networking (EFA), and supported by advanced virtualization (AWS Nitro System) and large-scale clustering (Amazon EC2 UltraClusters), customers can scale to thousands of GB200 superchips. NVIDIA Blackwell on AWS represents a big step forward in accelerating inference workloads for resource-intensive, multi-billion parameter language models.

Building on the success of NVIDIA H100-based EC2 P5 instances, which are available to customers for short durations via Amazon EC2 Capacity Blocks for ML, AWS plans to offer EC2 instances with the new B100 GPUs deployed in EC2 UltraClusters to accelerate the training and inference of generative AI at scale. The GB200s will also be available on NVIDIA DGX™ Cloud, an AI platform co-developed on AWS that provides enterprise developers with dedicated access to the infrastructure and software needed to build and deploy advanced generative AI models. Blackwell-based DGX Cloud instances on AWS will accelerate the development of cutting-edge generative AI and LLMs that can scale beyond 1 trillion parameters.

Improve AI Security with AWS Nitro System, AWS KMS, Encrypted EFA, and Blackwell Encryption

As customers rapidly implement AI in their organizations, they need to have confidence that their data is being handled securely throughout their training workflow. The security of model weights (the parameters that a model learns during training and which are essential to its ability to make predictions) is paramount to protecting customer intellectual property, preventing model tampering, and maintaining model integrity.

AWS AI infrastructure and services already have security features to give customers control over their data and ensure it is not shared with third-party model providers. The combination of the AWS Nitro System and the NVIDIA GB200 takes AI security one step further by preventing unauthorized people from accessing model weights. The GB200 enables physical encryption of the NVLink connections between GPUs and encrypts data transfer from the Grace CPU to the Blackwell GPU, while EFA encrypts data between servers for distributed training and inference. The GB200 will also benefit from the AWS Nitro System, which offloads I/O functions from the host CPU/GPU to specialized AWS hardware to deliver more consistent performance, while its enhanced security protects customer code and data during processing, on both the customer side and the AWS side. This capability, available only on AWS, has been independently verified by NCC Group, a leading cybersecurity company.

With the GB200 on Amazon EC2, AWS will enable customers to create a trusted execution environment alongside their EC2 instance, using AWS Nitro Enclaves and AWS KMS. Nitro Enclaves allow customers to encrypt their training data and weights with KMS, using key material under their control. The enclave can be loaded from the GB200 instance and can communicate directly with the GB200 Superchip. This allows KMS to communicate directly with the enclave and transmit key material to it in a cryptographically secure manner. The enclave can then pass this material to the GB200, protected from the customer instance and preventing AWS operators from accessing the key or decrypting the training data or model weights, giving customers unprecedented control over their data.
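The key flow described above follows the general pattern of envelope encryption: data is encrypted under a data key, and only a trusted environment is allowed to unwrap that data key. The sketch below is a minimal illustration of that pattern using only the Python standard library. Everything in it is an assumption made for illustration: `ToyKeyService`, `seal`, `unseal`, and the caller labels are hypothetical names, the cipher is a toy SHA-256 keystream rather than a production algorithm, and a plain string stands in for the cryptographic attestation that the real services rely on. It is not the AWS KMS or Nitro Enclaves API.

```python
import hashlib
import hmac
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter keystream (toy cipher, illustration only)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def seal(key: bytes, plaintext: bytes) -> bytes:
    """Encrypt and authenticate: returns nonce + ciphertext + HMAC tag."""
    nonce = secrets.token_bytes(16)
    ciphertext = _keystream_xor(key, nonce, plaintext)
    tag = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def unseal(key: bytes, blob: bytes) -> bytes:
    """Verify the tag, then decrypt; raises ValueError on tampering or wrong key."""
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return _keystream_xor(key, nonce, ciphertext)


class ToyKeyService:
    """Hypothetical stand-in for a key service; NOT the AWS KMS API."""

    def __init__(self, trusted_caller: str) -> None:
        self._master_key = secrets.token_bytes(32)  # never leaves the service
        self._trusted_caller = trusted_caller

    def new_data_key(self) -> tuple[bytes, bytes]:
        """Return (plaintext data key, data key wrapped under the master key)."""
        data_key = secrets.token_bytes(32)
        return data_key, seal(self._master_key, data_key)

    def unwrap(self, wrapped_key: bytes, caller: str) -> bytes:
        """Release the data key only to the trusted caller (label stands in for attestation)."""
        if caller != self._trusted_caller:
            raise PermissionError(f"{caller} may not unwrap data keys")
        return unseal(self._master_key, wrapped_key)


def demo() -> bytes:
    kms = ToyKeyService(trusted_caller="enclave")
    data_key, wrapped_key = kms.new_data_key()
    encrypted_weights = seal(data_key, b"model weights")
    del data_key  # only the wrapped key and the ciphertext are stored

    try:
        kms.unwrap(wrapped_key, caller="parent-instance")  # refused
    except PermissionError:
        pass

    key_in_enclave = kms.unwrap(wrapped_key, caller="enclave")
    return unseal(key_in_enclave, encrypted_weights)


if __name__ == "__main__":
    print(demo())
```

The design point the sketch tries to capture is that the stored artifacts (wrapped key and ciphertext) are useless on their own: the data key is only released to the environment the key service trusts, which in the toy model is the caller labeled `"enclave"`.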

Project Ceiba taps Blackwell to power NVIDIA's future generative AI innovation on AWS

Announced at AWS re:Invent 2023, Project Ceiba is a collaboration between NVIDIA and AWS to build one of the world's fastest AI supercomputers. Hosted exclusively on AWS, the supercomputer is available for NVIDIA's own research and development. This one-of-a-kind supercomputer is being built using the new NVIDIA GB200 NVL72, a system featuring fifth-generation NVLink, which scales to 20,736 B200 GPUs connected to 10,368 NVIDIA Grace processors. The system scales out using fourth-generation EFA networking, delivering up to 800 Gbps per superchip of low-latency, high-bandwidth network throughput. It is capable of processing a massive 414 exaflops of AI, a sixfold performance increase over earlier plans to build Ceiba on the Hopper architecture. NVIDIA R&D teams will use Ceiba to advance AI for LLMs, graphics (image/video/3D generation) and simulation, computational biology, robotics, self-driving cars, NVIDIA Earth-2 climate prediction, and more, to help NVIDIA power future generative AI innovation.
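The Ceiba figures quoted above can be cross-checked with simple arithmetic. The short sketch below uses only numbers stated in this release (72 GPUs and 36 Grace processors per GB200 NVL72, 20,736 GPUs, 10,368 Grace processors, 414 exaflops, and the sixfold comparison); the function and variable names are illustrative, and the per-GPU and Hopper-plan values are derived averages, not figures published here.

```python
# Figures taken from the release text.
GPUS_PER_NVL72 = 72
GRACE_PER_NVL72 = 36
CEIBA_GPUS = 20_736
CEIBA_GRACE = 10_368
CEIBA_EXAFLOPS = 414
SPEEDUP_OVER_HOPPER_PLAN = 6


def ceiba_figures() -> dict:
    """Derive system counts and averages implied by the stated Ceiba numbers."""
    systems, leftover = divmod(CEIBA_GPUS, GPUS_PER_NVL72)
    return {
        # Number of GB200 NVL72 systems needed to reach the stated GPU count.
        "nvl72_systems": systems,
        "leftover_gpus": leftover,
        # Grace processors implied by that many NVL72 systems.
        "grace_cpus": systems * GRACE_PER_NVL72,
        # Average AI throughput per GPU (1 exaflop = 1,000 petaflops).
        "petaflops_per_gpu": CEIBA_EXAFLOPS * 1000 / CEIBA_GPUS,
        # Throughput of the earlier Hopper-based plan implied by the 6x claim.
        "implied_hopper_plan_exaflops": CEIBA_EXAFLOPS / SPEEDUP_OVER_HOPPER_PLAN,
    }


if __name__ == "__main__":
    for name, value in ceiba_figures().items():
        print(f"{name}: {value}")
```

Under these inputs the GPU count divides evenly into 288 NVL72 systems, which in turn yields exactly the stated 10,368 Grace processors, and the stated 414 exaflops works out to roughly 20 petaflops per GPU on average.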

The collaboration between AWS and NVIDIA accelerates the development of generative AI applications and advances use cases in healthcare and life sciences.

AWS and NVIDIA have joined forces to deliver high-performance, low-cost inference for generative AI through the integration of Amazon SageMaker with NVIDIA NIM™ inference microservices, available with NVIDIA AI Enterprise. Customers can use this combination to quickly deploy foundation models (FMs) that are precompiled and optimized to run on NVIDIA GPUs to SageMaker, reducing time to market for generative AI applications.

AWS and NVIDIA have partnered to expand computer-aided drug discovery with new NVIDIA BioNeMo™ FMs for generative chemistry, protein structure prediction, and understanding how drug molecules interact with targets. These new models will soon be available on AWS HealthOmics, a purpose-built service that helps healthcare and life sciences organizations store, query, and analyze genomic, transcriptomic, and other omics data.

The AWS HealthOmics and NVIDIA Healthcare teams are also working together to launch generative AI microservices to advance drug discovery, medtech, and digital health, providing a new catalog of GPU-accelerated cloud endpoints for biology, chemistry, imaging, and healthcare data so healthcare companies can take advantage of the latest advances in generative AI on AWS.