Date: 2024-03-12



As the world rushes to take advantage of the latest wave of AI technology, one piece of high-tech hardware has become a surprisingly popular commodity: graphics processing units (GPUs).

Top-of-the-line GPUs can sell for tens of thousands of dollars, and leading manufacturer NVIDIA has a market valuation of more than $2 trillion as demand for its products soars.

GPUs are more than just high-end AI products. Cell phones, laptops, and gaming consoles also have less powerful GPUs.

By now you're probably wondering: what exactly is a GPU, and what makes them so special?

What is a GPU?

GPUs were originally designed primarily for rapidly generating and displaying complex 3D scenes and objects, such as those involved in video games and computer-aided design software. Modern GPUs also handle tasks such as decompressing video streams.

The brain of most computers is a chip called a central processing unit (CPU). Although CPUs can be used to generate graphical scenes and decompress videos, they are typically much slower and less efficient than GPUs at these tasks. CPUs are better suited to general-purpose computing tasks, such as word processing and browsing web pages.

How is a GPU different from a CPU?

A typical modern CPU consists of 8 to 16 cores, each of which can process complex tasks one after another.

A GPU, on the other hand, has thousands of relatively small cores that are all designed to work together (in parallel) to achieve high overall processing speed. This makes it suitable for tasks that require a large number of simple operations that can be performed simultaneously rather than one after the other.
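As a rough illustration of the idea, the sketch below applies the same simple operation to several values, first one after the other and then by handing the values to a pool of workers. The `brighten` function and the pixel values are made up for this example, and Python threads are only an analogy here: they do not deliver the kind of speed-up that thousands of GPU cores do.

```python
from concurrent.futures import ThreadPoolExecutor

def brighten(pixel: int) -> int:
    """A simple, independent operation: raise brightness, capped at 255."""
    return min(pixel + 40, 255)

# Hypothetical pixel brightness values, invented for illustration.
pixels = [0, 100, 200, 250]

# CPU-style: handle the values one after the other.
sequential = [brighten(p) for p in pixels]

# GPU-style (by analogy): hand each value to a separate worker.
# No result depends on any other, so the order of work does not matter.
with ThreadPoolExecutor(max_workers=4) as pool:
    parallel = list(pool.map(brighten, pixels))

print(sequential)
print(parallel)
```

Both approaches give the same answer. The point is that because each value is processed independently, the work can be spread across as many workers as are available.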

There are two main types of traditional GPUs.

The first is the standalone chip, which often comes on an add-on card for large desktop computers. The second is a GPU combined with a CPU in the same chip package, which is often found in laptops and gaming consoles like the PlayStation 5. In both cases, the CPU controls what the GPU does.

Why are GPUs so useful for AI?

It turns out that GPUs can be repurposed to do more than just generate graphical scenes.

Many machine learning techniques behind artificial intelligence (AI), such as deep neural networks, rely heavily on various forms of matrix multiplication.

This is a mathematical operation in which very large sets of numbers are multiplied and summed together. These operations lend themselves well to parallel processing and can therefore be performed very quickly by GPUs.
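To make this concrete, here is a minimal sketch of matrix multiplication in plain Python. The function name `matmul` and the two small matrices are chosen only for illustration; real AI software uses heavily optimized libraries rather than code like this.

```python
def matmul(a, b):
    """Multiply matrix a (rows x inner) by matrix b (inner x cols)."""
    rows, inner, cols = len(a), len(b), len(b[0])
    # Each output cell is a sum of products of one row of a and one
    # column of b. No cell depends on any other cell, which is why
    # the cells could all be computed at the same time.
    return [
        [sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for i in range(rows)
    ]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
result = matmul(a, b)
print(result)  # [[19, 22], [43, 50]]
```

This tiny example has only four output cells. In a deep neural network the matrices can have thousands of rows and columns, giving millions of independent cells to compute.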

What's next for GPUs?

The number-crunching power of GPUs is steadily increasing, thanks to rises in the number of cores and their operating speeds. These gains are primarily driven by advances in chip manufacturing by companies such as Taiwan's TSMC.

The size of individual transistors, the basic components of computer chips, is decreasing, allowing more transistors to be placed in the same physical space.

However, that is not the whole story. While traditional GPUs are useful for AI-related computational tasks, they are not optimal.

Just as GPUs were originally designed to speed up computers by providing graphics-specific processing, there are also accelerators designed to speed up machine learning tasks. These accelerators are often referred to as data center GPUs.

Some of the most popular accelerators, made by companies like AMD and NVIDIA, started out as traditional GPUs. Over time, their designs have evolved to better handle various machine learning tasks, for example by supporting the more efficient "brain float" number format.
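The brain float format mentioned above, commonly written bfloat16, stores a number in 16 bits instead of the 32 bits of a standard single-precision value. It keeps the sign and the full 8-bit exponent but only 7 bits of the fraction, so each number takes half the memory at the cost of some precision. The sketch below mimics this by simply keeping the top 16 bits of a 32-bit value. The helper names are made up for this example, and plain truncation is a simplification: real hardware typically rounds to the nearest value instead.

```python
import struct

def to_bfloat16_bits(x: float) -> int:
    """Return the 16-bit pattern left after truncating float32(x)."""
    (bits32,) = struct.unpack(">I", struct.pack(">f", x))
    return bits32 >> 16  # drop the 16 lowest fraction bits

def from_bfloat16_bits(bits16: int) -> float:
    """Expand a 16-bit pattern back to a float by zero-filling."""
    (value,) = struct.unpack(">f", struct.pack(">I", bits16 << 16))
    return value

approx = from_bfloat16_bits(to_bfloat16_bits(3.14159))
print(approx)  # 3.140625
```

The result is close to the original value but not identical, a trade-off that deep neural networks generally tolerate well.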

Other accelerators, such as Google's Tensor Processing Units and Tenstorrent's Tensix Cores, were designed from the ground up to accelerate deep neural networks.

Data center GPUs and other AI accelerators typically come with significantly more memory than traditional GPU add-on cards, which is essential for training large AI models. The larger the AI model, the more capable and accurate it tends to be.

To further speed up training and handle even larger AI models such as ChatGPT, many data center GPUs can be pooled together to form a supercomputer. This requires more complex software to properly harness the available number-crunching power. Another approach is to create a single, very large accelerator, such as the wafer-scale processors made by Cerebras.

Are specialized chips the future?

CPUs have not been standing still either. Recent CPUs from AMD and Intel include low-level instructions that speed up the number-crunching required by deep neural networks. This added functionality mainly helps with inference tasks, that is, using AI models that have already been developed elsewhere.

Training AI models in the first place still requires accelerators like large GPUs.

It is also possible to create ever more specialized accelerators designed for specific machine learning algorithms. For example, a company called Groq recently produced a language processing unit (LPU) specifically designed to run large language models along the lines of ChatGPT.

However, creating these specialized processors requires significant engineering resources. History has shown that the usage and popularity of certain machine learning algorithms tends to peak and then decline, so expensive specialized hardware can quickly become obsolete.

For the average consumer, though, that is unlikely to be an issue. The GPUs and other chips in the products you use are likely to keep quietly getting faster.

Conrad Sanderson, CSIRO Research Scientist and Team Leader

This article is republished from The Conversation under a Creative Commons license. Read the original article.

