AMD enters the AI acceleration game with broad industry support. First shipping product is the Dell PowerEdge XE9680 with AMD Instinct MI300X.
AMD has run a distant second to Nvidia in the HPC GPU acceleration market, even though its accelerators power Frontier, the world's fastest supercomputer. It's looking to gain ground with the launch of the Instinct MI300X data center GPU.
AMD CEO Lisa Su kicked off the launch event and compared the AI revolution to the Internet revolution that began 30 years ago. “But what’s different about AI is that the adoption rate is just much, much faster. So although so much has happened, the truth is, right now we’re just at the very beginning of the AI era. And we can see how it’s so capable of touching every aspect of our lives,” she said.
The company first introduced the Instinct MI300 family at CES earlier this year. It formally launched the Instinct MI300X along with its CPU-GPU hybrid chip, the Instinct MI300A, at its Advancing AI event in San Jose, Calif., on Thursday, marking its biggest challenge yet to Nvidia’s dominance in the HPC acceleration race.
The OEM support is unequivocal. Several OEMs, including Dell Technologies, Hewlett Packard Enterprise, Lenovo and Supermicro, said they will ship servers with the MI300X accelerator card, while HPE, Supermicro, Gigabyte, and Atos subsidiary Eviden will ship servers with the MI300A card next year.
On the cloud side, the MI300X will power upcoming virtual machine instances from Microsoft Azure and bare metal instances from Oracle Cloud Infrastructure. In addition, smaller cloud service providers such as Aligned, Arkon Energy, Cirrascale, Crusoe and Denvr Dataworks said they would also support the MI300X.
AMD also announced ROCm 6, the latest version of its GPU programming platform, which it promotes as an alternative to Nvidia's CUDA. The update features optimizations for generative AI, particularly large language models, along with support for new data types, advanced graph and kernel optimizations, optimized libraries and advanced attention algorithms.
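ROCm's pitch as a CUDA alternative is easiest to see at the framework level: ROCm builds of PyTorch, for instance, expose AMD GPUs through the same torch.cuda API that CUDA-targeted code already uses. A minimal sketch, assuming a ROCm build of PyTorch and an Instinct GPU:

```python
import torch

# On a ROCm build of PyTorch, AMD GPUs are exposed through the familiar
# torch.cuda interface, so CUDA-targeted code runs without source changes.
if torch.cuda.is_available():         # True on ROCm builds with an Instinct GPU
    device = torch.device("cuda")     # dispatches to the ROCm/HIP backend
    x = torch.randn(4096, 4096, device=device, dtype=torch.float16)
    y = x @ x                         # matmul runs through ROCm's BLAS libraries
    print(torch.version.hip)          # a version string on ROCm builds, None on CUDA builds
```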
AMD Instinct MI300X GPU
Like Nvidia, AMD repurposes its commercial GPU technology for data center work, with some necessary modifications. In this case, the Instinct MI300X is based on the CDNA 3 architecture, AMD's compute-focused counterpart to the RDNA architecture used in its consumer GPUs.
The two Instinct cards target different markets. The MI300X will handle training and inference on large language models such as Meta's Llama 2 and Bloom, while the MI300A will focus on general HPC and AI workloads.
The MI300X is a beast of a chip no matter how you slice it, and AMD CEO Lisa Su didn't hesitate to compare it to "the competition." It's about the size of a drink coaster, massive by any processor standard, with 192GB of HBM3 high-bandwidth memory, 2.4 times the HBM3 capacity of Nvidia's H100 card. Its memory bandwidth is 5.3 TB/s, 60% greater than the H100's 3.3 TB/s. It also draws more power: 750 W, versus the H100's 700 W.
In terms of HPC performance, AMD said the MI300X can hit up to 163.4 teraflops of FP64 double-precision matrix math and 81.7 teraflops of FP64 vector operations, both of which it claims are 2.4 times faster than the H100.
For single-precision floating point math, also known as FP32, the MI300X can hit 163.4 teraflops for both matrix and vector operations. The chip’s vector performance is 2.4 times better than the H100, AMD claims.
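Those ratios are easy to sanity-check. A quick sketch using the figures quoted above; the H100 numbers (67 teraflops FP64 matrix, 34 teraflops FP64 vector) are assumptions drawn from Nvidia's public SXM spec sheet, not from AMD's presentation:

```python
# MI300X figures as quoted above; H100 SXM figures assumed from Nvidia's spec sheet.
mi300x = {"hbm_gb": 192, "bw_tbs": 5.3, "fp64_matrix_tf": 163.4, "fp64_vector_tf": 81.7}
h100   = {"hbm_gb": 80,  "bw_tbs": 3.3, "fp64_matrix_tf": 67.0,  "fp64_vector_tf": 34.0}

print(f"HBM capacity: {mi300x['hbm_gb'] / h100['hbm_gb']:.1f}x")                     # 2.4x
print(f"Bandwidth:    {(mi300x['bw_tbs'] / h100['bw_tbs'] - 1) * 100:.0f}% greater") # ~61%
print(f"FP64 matrix:  {mi300x['fp64_matrix_tf'] / h100['fp64_matrix_tf']:.1f}x")     # ~2.4x
print(f"FP64 vector:  {mi300x['fp64_vector_tf'] / h100['fp64_vector_tf']:.1f}x")     # ~2.4x
```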
AMD MI300A APU
AMD calls the Instinct MI300A the "world's first data center APU for HPC and AI." APU is the company's term for a single package that combines CPU cores and GPU cores with shared memory. AMD has been offering these types of products as desktop processors for PCs since 2011, but this is its first server APU.
The MI300A uses the same Zen 4 cores as AMD's EPYC server processors, combined with GPU cores based on the same CDNA 3 architecture as the MI300X, and the two share a unified 128GB pool of HBM3 memory. The chip's memory bandwidth is the same as the MI300X's at 5.3 TB/s, but its power draw starts at 550 W, much less than its big brother.
For FP32 math, the MI300A can hit 122.6 teraflops for both matrix and vector operations, about 25% less than the MI300X's 163.4 teraflops.
While the MI300A has slightly less GPU performance than the MI300X, it makes up for it in power efficiency. “The MI300A has twice the HPC performance per watt of the nearest competitor. Customers can thus fit more nodes into their overall facility power budget, and better support their sustainability goals,” said Forrest Norrod, executive vice president and general manager of AMD’s data center solutions business unit.
“The second advantage is the ability to optimize power management between the CPU and the GPU. That means dynamically shifting power from one processor to another, depending on the needs of the workload, optimizing application performance,” Norrod said.
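AMD did not say which metric underpins the two-times claim, but it is roughly consistent with the published FP64 figures. A back-of-the-envelope sketch, assuming FP64 matrix throughput as the HPC proxy and Nvidia's public H100 SXM specs as the comparison point:

```python
# MI300A: 122.6 TFLOPS FP64 matrix at a 550 W default TDP (AMD's published figures).
# H100 SXM: 67 TFLOPS FP64 Tensor Core at 700 W (assumed from Nvidia's spec sheet).
mi300a_tflops, mi300a_watts = 122.6, 550
h100_tflops, h100_watts = 67.0, 700

ratio = (mi300a_tflops / mi300a_watts) / (h100_tflops / h100_watts)
print(f"FP64 performance per watt advantage: {ratio:.1f}x")  # ~2.3x, in line with the claim
```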
Dell PowerEdge XE9680 with AMD Instinct MI300X
All the top OEMs committed to the Instinct cards, but only Dell is shipping a product now. The company announced an expansion of its Dell Generative AI Solutions portfolio, featuring new PowerEdge servers using the MI300X.
The PowerEdge XE9680 with AMD Instinct MI300X offers high-performance capabilities for enterprises working on customized LLMs. The server comes with eight MI300X GPUs, each with 192GB of HBM3 delivering 5.3 TB/s of bandwidth, for a total coherent HBM3 capacity of 1.5 TB per server and more than 21 petaflops of FP16 performance.
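The per-server totals follow directly from the per-GPU figures. A quick check (the per-GPU FP16 breakdown is our inference; Dell's announcement quotes only the aggregate number):

```python
gpus = 8
hbm_per_gpu_gb = 192
fp16_total_pf = 21  # Dell's headline figure

print(gpus * hbm_per_gpu_gb)   # 1536 GB, i.e. ~1.5 TB of coherent HBM3
print(fp16_total_pf / gpus)    # ~2.6 PFLOPS FP16 per GPU, which matches AMD's MI300X
# FP16 figure with structured sparsity enabled (our assumption; the dense FP16
# figure is ~1.3 PFLOPS per GPU).
```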
Along with the new server, Dell also announced the next step in its Dell Generative AI Solutions portfolio, which aims to make it easier for organizations to deploy trustworthy GenAI. The new Dell Validated Design for Generative AI with AMD-powered AI frameworks, available next year, extends the Dell Generative AI Solutions ecosystem and will include open-source LLMs.
More AMD news from Microsoft, Meta and Oracle
As part of the big announcement, Microsoft detailed how it is deploying AMD Instinct MI300X accelerators to power the new Azure ND MI300X v5 Virtual Machine (VM) series optimized for AI workloads.
Meta said it is adding AMD Instinct MI300X accelerators to its data centers, in combination with ROCm 6, to power AI inference workloads, and acknowledged the ROCm 6 optimizations AMD has made for the Llama 2 family of models.
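As a hypothetical illustration of the kind of workload Meta describes, here is what Llama 2 inference looks like through Hugging Face Transformers on a ROCm build of PyTorch. The model ID and generation parameters are illustrative, and the sketch assumes the transformers and accelerate packages plus approved access to Meta's gated model repo:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"   # gated repo; requires approved access
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # fits comfortably in one MI300X's 192GB of HBM3
    device_map="auto",          # places the model on the available GPU(s)
)

prompt = "High-bandwidth memory matters for LLM inference because"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=32)
print(tok.decode(out[0], skip_special_tokens=True))
```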
Oracle unveiled plans to offer OCI bare metal compute solutions featuring AMD Instinct MI300X accelerators, as well as plans to include the accelerators in its upcoming generative AI service.