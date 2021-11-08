



AMD on Monday announced the Instinct MI200 accelerator, its latest generation of data center GPUs. The chipmaker says it is the fastest HPC and AI accelerator, surpassing the bar set by the MI100 launched last year.

According to AMD, the Instinct MI200 delivers up to 4.9 times better high-performance computing (HPC) performance than existing data center GPUs. The company also claims it is the fastest for AI training, offering up to 1.2x higher peak FLOPS for mixed-precision workloads.

The accelerator contains 58 billion transistors built on a 6nm process. That allows for up to 220 compute units, increasing compute density by more than 80% compared with the MI100. It’s also the world’s first GPU with 128GB of HBM2E memory.

It is also the world’s first multi-die GPU, built on AMD’s second-generation CDNA architecture. AMD introduced CDNA last year, when it split its GPU designs into separate lines for the data center and for gaming. The CDNA architecture is designed specifically to optimize data center compute workloads.

“Of course, these workloads run on very different systems, so splitting them into two products and two chip families was an easy way to design a better product,” Brad McCredie, AMD’s VP for data center GPUs and accelerators, told reporters last week.

The new MI200 accelerator is about five times faster than Nvidia’s A100 GPU in peak FP64 performance, which matters for HPC workloads that require high precision, such as weather forecasting. Its peak FP32 vector performance is about 2.5 times higher. AMD pointed out that this matters for the kind of math used in vaccine simulations.
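
The ratios above can be checked with simple arithmetic. The sketch below assumes peak vector throughput figures from the vendors' public spec sheets (47.9 TFLOPS for both FP64 and FP32 on the top MI200 part, the MI250X; 9.7 TFLOPS FP64 and 19.5 TFLOPS FP32 on the A100). Those figures are not given in this article, so treat them, and the helper name `speedup`, as illustrative assumptions.

```python
# Peak vector throughput in TFLOPS. These values are assumptions taken from
# vendor spec sheets, not from AMD's statements quoted in this article.
PEAK_TFLOPS = {
    "MI250X": {"fp64": 47.9, "fp32": 47.9},
    "A100": {"fp64": 9.7, "fp32": 19.5},
}


def speedup(precision: str, new: str = "MI250X", old: str = "A100") -> float:
    """Return the ratio of peak throughput between two accelerators."""
    return PEAK_TFLOPS[new][precision] / PEAK_TFLOPS[old][precision]


for precision in ("fp64", "fp32"):
    # Peak ratios only; real application speedups depend on the workload.
    print(f"{precision}: {speedup(precision):.1f}x")
```

Under those assumed figures, the ratios come out at roughly 4.9x for FP64 and 2.5x for FP32, in line with the claims reported here. They compare theoretical peaks, not measured application performance.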

AMD also unveiled Milan-X, the first server CPU with 3D chiplet technology. It will officially launch in the first quarter of 2022.

These processors have three times the L3 cache of standard Milan processors. In Milan, each CCD had 32MB of L3 cache; in Milan-X, AMD brings that to 96MB per CCD. At the top of the stack, the CPU has a total of 804MB of cache per socket, which relieves pressure on memory bandwidth and reduces latency. AMD says this dramatically improves application performance.
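
As a rough check on how the per-CCD figures add up to the 804MB total, here is a minimal sketch. Only the per-CCD L3 sizes and the total come from the article; the CCD count, core count, and per-core L1 and L2 sizes are assumptions based on the 64-core top-of-stack part, and `total_cache_mb` is a hypothetical helper.

```python
# Cache totals for a top-of-stack socket.
# Only the per-CCD L3 sizes and the 804MB total come from the article; the
# CCD count, core count, and per-core L1/L2 sizes are assumptions.
CCDS_PER_SOCKET = 8          # assumption: 8 CCDs in the 64-core part
CORES_PER_SOCKET = 64        # assumption
L2_KB_PER_CORE = 512         # assumption
L1_KB_PER_CORE = 64          # assumption: 32KB instruction + 32KB data

MILAN_L3_MB_PER_CCD = 32     # from the article
MILAN_X_L3_MB_PER_CCD = 96   # from the article


def total_cache_mb(l3_mb_per_ccd: int) -> float:
    """Sum L3, L2, and L1 capacity across one socket, in MB."""
    l3 = CCDS_PER_SOCKET * l3_mb_per_ccd
    l2 = CORES_PER_SOCKET * L2_KB_PER_CORE / 1024
    l1 = CORES_PER_SOCKET * L1_KB_PER_CORE / 1024
    return l3 + l2 + l1


print(MILAN_X_L3_MB_PER_CCD / MILAN_L3_MB_PER_CCD)   # L3 growth per CCD
print(total_cache_mb(MILAN_L3_MB_PER_CCD))           # Milan total, MB
print(total_cache_mb(MILAN_X_L3_MB_PER_CCD))         # Milan-X total, MB
```

With these assumptions, the L3 cache alone accounts for 768MB of the Milan-X total, and the per-core L1 and L2 caches make up the remaining 36MB.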

According to AMD, Milan-X is the fastest server processor for technical computing workloads at the socket level, delivering more than 50% better performance than Milan on targeted technical computing workloads.

AMD has focused on several workloads that enable product design, such as EDA tools used to simulate and optimize chip design. Large caches are essential to improve the performance of these workloads.

Verification is one of the most important tasks in chip design. It helps detect defects early, before the chip is committed to silicon. Compared with Milan, Milan-X completes 66% more work in a given amount of time. This allows customers using EDA tools to complete verification and bring products to market faster, or to add tests in the same amount of time to further improve design quality and robustness.
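
To see what a 66% throughput gain means for a fixed verification job, the following sketch converts it into wall-clock time saved. It assumes the gain applies uniformly across the whole job, which real EDA flows may not satisfy; `time_saved` is a hypothetical helper.

```python
def time_saved(throughput_gain: float) -> float:
    """Fraction of wall-clock time saved on a fixed amount of work.

    throughput_gain is the relative increase in work completed per unit
    time (0.66 means 66% more work in the same time).
    """
    return 1 - 1 / (1 + throughput_gain)


# A 66% throughput gain shortens a fixed job by roughly 40%.
print(f"{time_saved(0.66):.0%}")
```

In other words, under that assumption a verification run that took ten hours on Milan would take about six hours on Milan-X.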

