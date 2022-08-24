



Data center networks form the foundation of modern warehouse-scale and cloud computing. The fundamental promise of uniform, arbitrary communication among tens of thousands of servers, at hundreds of Gb/s of bandwidth and with sub-100-microsecond latency, has transformed computing and storage. The main advantage of this model is simple but profound: adding incremental servers or storage devices to a higher-level service increases that service’s capacity and capability proportionally. At Google, our Jupiter data center network technology supports this kind of scale-out capability for foundational user-facing services such as Search, YouTube, and Gmail, and for Cloud services such as AI and machine learning, Compute Engine, BigQuery analytics, Spanner databases, and many more.

Over the past eight years, we have deeply integrated Optical Circuit Switching (OCS) and Wavelength Division Multiplexing (WDM) into Jupiter. Decades of conventional wisdom suggested that doing so was impractical, but the combination of OCS with our software-defined networking (SDN) architecture has enabled new capabilities: higher performance with lower latency, cost, and power consumption; adaptation to real-time application priorities and communication patterns; and upgrades without downtime. Jupiter does all this while reducing flow completion time by 10%, improving throughput by 30%, using 40% less power, costing 30% less, and delivering 50x less downtime. You can read more about how we did this in our paper “Jupiter Evolving: Transforming Google’s Datacenter Network via Optical Circuit Switches and Software-Defined Networking”, which we presented at SIGCOMM 2022 today.

Here’s an overview of this project:

The Evolving Jupiter Data Center Network

In 2015, we showed how our Jupiter data center network scaled to more than 30,000 servers, delivering uniform 40Gb/s connectivity per server and supporting more than 1Pb/s of aggregate bandwidth. Today, Jupiter supports more than 6Pb/s of data center bandwidth. We achieved this unprecedented level of performance and scale by leveraging three ideas:

Software Defined Networking (SDN) – A logically centralized hierarchical control plane for programming and managing thousands of switching chips in a data center network.

Clos Topology – A non-blocking multi-stage switching topology built from smaller radix switch chips that scales to arbitrarily large networks.

Merchant Switch Silicon – Cost-effective commodity general-purpose Ethernet switching components for converging storage and data networks.
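The scale figures above can be sanity-checked with a little arithmetic. The sketch below is illustrative only: `aggregate_bandwidth_pbps` and `fat_tree_hosts` are hypothetical helper names, and the k-ary fat tree formula (k^3/4 hosts for radix-k switches) is the textbook three-tier Clos construction, not a description of Jupiter's actual block design.

```python
def aggregate_bandwidth_pbps(servers: int, gbps_per_server: float) -> float:
    """Total server-facing bandwidth in Pb/s (1 Pb/s = 1,000,000 Gb/s)."""
    return servers * gbps_per_server / 1_000_000


def fat_tree_hosts(radix: int) -> int:
    """Hosts supported by a textbook three-tier k-ary fat tree: k^3 / 4.

    Assumption: every switch has the same even radix k. Real deployments,
    including the one described in this post, are organized differently.
    """
    if radix <= 0 or radix % 2:
        raise ValueError("radix must be a positive even number")
    return radix ** 3 // 4


# 30,000 servers at 40 Gb/s each, the 2015 figures quoted above.
print(aggregate_bandwidth_pbps(30_000, 40))  # 1.2

# Small-radix chips still reach tens of thousands of hosts in three tiers.
print(fat_tree_hosts(48))  # 27648
```

The first result matches the "more than 1Pb/s" aggregate quoted for 2015. The second shows why a Clos fabric built from modest-radix merchant chips can reach that server count.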

Building on these three pillars, Jupiter’s architectural approach supported a significant shift in distributed systems architecture and set the course for how the industry as a whole builds and manages data center networks.

However, two major challenges remained for hyperscale data centers. First, the data center network must be deployed at the scale of an entire building, perhaps 40MW or more of infrastructure. Second, the servers and storage devices deployed in that building are constantly evolving: for example, native network interconnects have moved from 40Gb/s to 100Gb/s to 200Gb/s, and now to 400Gb/s. Data center networks must therefore evolve dynamically to keep pace with the new elements they connect.

Unfortunately, the Clos topology requires a spine layer that uniformly supports the fastest devices that might connect to it, as shown in the figure below. Deploying a building-scale, Clos-based data center network therefore meant pre-deploying a very large spine layer that ran at the fixed speed of the then-current generation. This is because the Clos topology inherently requires an all-to-all fanout from the aggregation blocks to the spine, so adding to the spine incrementally would require rewiring the entire data center. One way to support new devices running at higher line rates would be to replace the entire spine layer with one that supports the new speed, but this is impractical given the hundreds of individual racks housing the switches and the tens of thousands of fiber pairs running across the building.
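To make the recabling cost concrete, here is a minimal sketch of what an even all-to-all fanout implies when the spine grows. The block counts, uplink counts, and function names are hypothetical and chosen only for illustration. The model assumes each aggregation block spreads its uplinks evenly across all spine blocks and that a link runs at the slower of its two endpoints.

```python
def uplinks_per_spine(uplinks_per_block: int, spine_blocks: int) -> int:
    """Uplinks one aggregation block sends to each spine block (even fanout)."""
    if spine_blocks <= 0 or uplinks_per_block % spine_blocks:
        raise ValueError("uplinks must divide evenly across spine blocks")
    return uplinks_per_block // spine_blocks


def links_to_move(agg_blocks: int, uplinks_per_block: int,
                  old_spines: int, new_spines: int) -> int:
    """Existing uplinks that must be recabled when the spine grows."""
    before = uplinks_per_spine(uplinks_per_block, old_spines)
    after = uplinks_per_spine(uplinks_per_block, new_spines)
    # Every aggregation block gives up (before - after) links on each old
    # spine block so they can be re-terminated on the new spine blocks.
    return agg_blocks * old_spines * (before - after)


def effective_uplink_gbps(block_gbps: int, spine_gbps: int) -> int:
    """A link runs at the slower of its two endpoints."""
    return min(block_gbps, spine_gbps)


# Hypothetical fabric: 64 aggregation blocks with 512 uplinks each,
# with the spine doubling from 16 to 32 blocks.
print(links_to_move(64, 512, 16, 32))   # 16384

# A 400 Gb/s aggregation block attached to a 100 Gb/s spine is derated.
print(effective_uplink_gbps(400, 100))  # 100
```

In this toy model, doubling the spine touches half of every aggregation block's uplinks, and a newer, faster block gains nothing until the spine itself is replaced.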

