



Google Cloud recently announced a preview of Batch, a managed service for running batch jobs at scale. This new service supports the latest T2A Arm-based instances and Spot VMs for large batch jobs with task parallelization.

Batch handles dynamic resource provisioning and autoscaling, executes requests in parallel, supports scripts and containerized workloads, and can leverage native Google Cloud services and batch tools. Shamel Jacobs, product manager at Google, and Bolian Yin, software engineer at Google, write:

Batch processing is as old as computing itself, with the term “batch” dating back to the punch cards used in early mainframes (…) Batch jobs are widely used in areas such as research, simulation, genomics, visual effects, fintech, manufacturing, and EDA.

The new service supports common job types such as array jobs and multi-node MPI applications. Jacobs and Yin emphasize that Batch is not the only service that handles batch processing on Google Cloud:

Batch is a general-purpose batch job service, the latest in a long list of products we’ve created over the years to process jobs that help companies move their workloads to the cloud. These services include Cloud Life Sciences (formerly Google Genomics), Dataflow, and Cloud Run Jobs.

Source: https://cloud.google.com/blog/products/compute/new-batch-service-processes-batch-jobs-on-google-cloud

The key concepts of the new service are jobs, which execute computational work from start to completion; tasks, which run on Compute Engine instances; array jobs, in which multiple tasks within a job run the same executable simultaneously; and resources, such as Compute Engine instances, Cloud Storage, and NFS mounts. AMD Director Lewis Carroll commented:

The T2D Tau VM with Batch should be a monster for large-scale life sciences, chemical, derivatives pricing, risk, and other massively parallel distributed computing jobs.
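To make the job, task, and array-job concepts more concrete, the following minimal Python sketch builds a job definition as a plain dictionary and prints it as JSON. The field names (`taskGroups`, `taskSpec`, `runnables`, `taskCount`, `parallelism`, `allocationPolicy`) and the `BATCH_TASK_INDEX` and `BATCH_TASK_COUNT` variables follow the Batch job resource as we understand it and should be checked against the current API reference. The helper `build_array_job`, the machine type, and the resource sizes are illustrative assumptions, not values from the announcement.

```python
import json


def build_array_job(task_count: int, parallelism: int) -> dict:
    """Return a dict shaped like a Batch job definition (assumed field names).

    One task group describes an array job: `task_count` tasks run the same
    script, with at most `parallelism` of them running at the same time.
    """
    if not 1 <= parallelism <= task_count:
        raise ValueError("parallelism must be between 1 and task_count")

    return {
        "taskGroups": [
            {
                "taskSpec": {
                    # Every task in the array runs the same executable.
                    "runnables": [
                        {
                            "script": {
                                "text": "echo task ${BATCH_TASK_INDEX} of ${BATCH_TASK_COUNT}"
                            }
                        }
                    ],
                    # Illustrative per-task resource request.
                    "computeResource": {"cpuMilli": 1000, "memoryMib": 512},
                },
                "taskCount": task_count,
                "parallelism": parallelism,
            }
        ],
        # Illustrative choice of Compute Engine resources; Spot VMs assumed.
        "allocationPolicy": {
            "instances": [
                {
                    "policy": {
                        "machineType": "e2-standard-2",
                        "provisioningModel": "SPOT",
                    }
                }
            ]
        },
        "logsPolicy": {"destination": "CLOUD_LOGGING"},
    }


if __name__ == "__main__":
    print(json.dumps(build_array_job(task_count=8, parallelism=4), indent=2))
```

A definition of this kind would typically be saved to a file and submitted through the command line tool or the API; the exact command and flags depend on the release and are not shown here.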

The cloud provider has released a media transcoding tutorial that leverages Batch to transcode H.264 video files to VP9. Other examples are available on GitHub, including busybox, a project that runs containers as batch jobs; primegen, an end-to-end sample using Workflows and Cloud Build with Batch; and wrf, a sample application that runs Weather Research and Forecasting models in batch jobs using MPI.

Developers can access Batch through APIs, command line tools, workflow engines, or the console to define job priorities and establish retry strategies. The service can also be used with the HPC Toolkit, a Google Cloud open source project for deploying high performance computing environments. According to the cloud provider:

Using Google Cloud Batch with the HPC Toolkit simplifies the setup required to provision and run more complex scenarios, such as setting up shared file systems and installing software used by Google Cloud Batch jobs. We can also share tested infrastructure solutions that work with Google Cloud Batch via HPC Toolkit blueprints.
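Returning to the job priorities and retry strategies mentioned above, the sketch below shows one way such settings might be applied to a job definition held as a Python dictionary. It assumes that the job resource has a top-level `priority` field and a per-task `maxRetryCount` field, and that the valid ranges are 0 to 99 and 0 to 10 respectively; these are our reading of the API and should be verified. The helper `with_priority_and_retries` is hypothetical.

```python
import copy


def with_priority_and_retries(job: dict, priority: int, max_retries: int) -> dict:
    """Return a copy of `job` with a priority and a retry limit applied.

    Field names and ranges are assumptions about the Batch job resource.
    """
    # Assumed valid ranges; check the API reference before relying on them.
    if not 0 <= priority < 100:
        raise ValueError("priority is assumed to be in the range 0..99")
    if not 0 <= max_retries <= 10:
        raise ValueError("max_retries is assumed to be in the range 0..10")

    updated = copy.deepcopy(job)  # leave the caller's definition untouched
    updated["priority"] = priority
    for group in updated.get("taskGroups", []):
        # Each task in the group may be retried up to `max_retries` times.
        group.setdefault("taskSpec", {})["maxRetryCount"] = max_retries
    return updated


if __name__ == "__main__":
    base = {"taskGroups": [{"taskSpec": {}, "taskCount": 4}]}
    print(with_priority_and_retries(base, priority=50, max_retries=3))
```

Returning a modified copy rather than mutating the input keeps one base definition reusable across jobs that differ only in priority or retry behavior.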

Currently in preview, Batch is available in a subset of Google Cloud regions: Iowa, South Carolina, Oregon, and Finland. There is no additional charge for using Batch; customers pay only for the resources used to run their jobs.

