



Remember when a GPU was a small, fanless video card with a name like Voodoo, Matrox, Nvidia, or ATI? This simple addition gave your PC a whole new world of responsive 2D and 3D graphics. If someone had told you then that future versions of GPUs would be used as high-performance tools for HPC, crypto, and generative AI, I suspect the answer would have been something like, “What are crypto and generative AI?”

The need for GPU hardware (or, better, accelerators) has never been greater, and the current high demand looks likely to persist into the near future. For HPC, this trend suggests that GPUs will be expensive and hard to find (unless you buy enough to go direct to the vendor).

GPUs are great for accelerating matrix operations, which are at the heart of many HPC applications. They provide one or more SIMD (Single Instruction Multiple Data) processing units that can carry out complex operations in parallel. As is often noted, the HPC GPU market owes a debt to the much larger gaming market (~7x) for helping to cover the cost of hardware that lets gamers fly across galaxies while letting scientists simulate them.
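To make the SIMD point concrete, here is a minimal sketch in plain Python of why matrix multiplication suits this kind of hardware. The function names `dot` and `matmul` are illustrative and not taken from any GPU library. The code runs serially; the point is only that every output element is computed by the same instruction sequence on different data, which is the pattern SIMD units execute in parallel.

```python
def dot(row, col):
    """Multiply-accumulate over one row and one column.

    The same sequence of instructions is used for every output element;
    only the input data differs.
    """
    return sum(a * b for a, b in zip(row, col))


def matmul(A, B):
    """Naive matrix product C = A x B on nested lists."""
    cols = list(zip(*B))  # columns of B
    # Each C[i][j] depends only on row i of A and column j of B,
    # so all of these dot products are independent of one another.
    return [[dot(row, col) for col in cols] for row in A]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul(A, B))
```

Because no output element depends on another, the work can be spread across as many parallel lanes as the hardware provides.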

The crypto market has changed this dynamic at the low end. In crypto terms, gaming GPUs are great at finding unique numbers quickly and generating heat. Demand was quite high but seems to be decreasing as the crypto market evolves.

At the high end, things are very different. According to a Yahoo Finance article, a Reports Insights report predicts:

“Global Graphics Processing Units (GPUs) Market is expected to register a CAGR of 33.5% during the period 2022-2030, driven by growing demand for scientific simulations, data analytics and artificial intelligence in graphics processing units (GPUs).”
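As a rough check on what the quoted growth rate implies, the compounding can be worked out directly. This is a back-of-the-envelope sketch, not a figure from the report: it assumes eight full years of compounding between 2022 and 2030, and the helper name `growth_multiple` is made up for illustration.

```python
def growth_multiple(cagr, years):
    """Total growth factor implied by a compound annual growth rate."""
    return (1.0 + cagr) ** years


# Assumption: 2022 to 2030 is treated as 8 compounding periods.
multiple = growth_multiple(0.335, 8)
print(f"Implied market size in 2030: about {multiple:.1f}x the 2022 level")
```

Under that assumption, a 33.5% CAGR corresponds to a market roughly ten times its 2022 size by 2030.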

This is good news for vendors in the GPU market. For end users, however, the growth of LLMs (Large Language Models such as ChatGPT) has created new demand for GPUs from outside the traditional HPC market.

Consider Inflection AI, a company developing a “personal AI” chatbot called Pi. To achieve this goal, the company built a supercomputer equipped with 22,000 NVIDIA H100 GPUs. To provide some context, Frontier, the number one system on the June 2023 TOP500 list, packs 37,632 GPUs.

Inflection AI’s situation is not unique. Two additional data points, as reported by SemiAnalysis, support the same trend.

“Even OpenAI can’t get enough GPUs, which significantly hampers its near-term roadmap. OpenAI can’t deploy its multimodal models due to GPU shortages.”

“For example, Bytedance, the Chinese company behind TikTok, is supposed to order more than $1 billion worth of A800/H800 from Nvidia.”

Looking for FLOPS?

Fortunately, GPUs are not a requirement for HPC. They are useful for many applications, however, and the latest TOP500 list (June 2023) indicates that thirty-seven percent of systems use GPUs. This number is growing, and the use of accelerators will continue as systems enter exaFLOPS territory.

As mentioned, GPUs are not necessary, but they are desirable for many HPC applications. The concern for many on-premises HPC purchases and cloud deployments is the across-the-board shortage of GPUs caused by the huge demands of the booming generative AI industry. The strong market demand for “any GPU” (Nvidia, AMD, or Intel) may prompt HPC practitioners to consider CPU-only options for accelerating their codes (e.g., multiple cores, AVX-512, HBM, 3D V-Cache, etc.).

The search for “GPU cycles” may also invite new methods. Remember that some of the original GPU applications in HPC started with a standard graphics card and a new language called “Brook,” which was the precursor to CUDA and ran on some of the early cards mentioned in the first paragraph. At first, the approach felt a little clunky, but the speedups were hard to ignore. The result reshaped the HPC sector.

Recently, in an interesting move, the latest version of the AMD ROCm GPU libraries (V5.6) added support for mobile and desktop iGPUs (integrated GPUs). In a brief post on LinkedIn, HPC maven James Cuff reported running a TensorFlow benchmark on both the CPU and the CPU/iGPU combination of a Ryzen 9 6900HX processor. The benchmark ran in 13 seconds on the CPU and completed in 3 seconds on the CPU/iGPU combination. Of course, more testing is needed, but just as the search for FLOPS once turned to early graphics cards, integrated GPUs may now add otherwise unused FLOPS to the HPC mix. The great GPU squeeze has arrived.

