The new Nvidia HGX H200 has been designed to support the high-performance computing workloads required to train generative AI models.
Nvidia has announced a new AI computing platform, the Nvidia HGX H200, a turbocharged version of the company's Hopper architecture powered by its latest GPU, the Nvidia H200 Tensor Core GPU.
The company is also teaming up with HPE to offer a supercomputing system, built on Nvidia GH200 Grace Hopper Superchips, specifically designed for generative AI training.
A surge in enterprise interest in AI has fueled demand for Nvidia GPUs to handle generative AI and high-performance computing workloads. Its latest GPU, the Nvidia H200, is the first to offer HBM3e, high-bandwidth memory that is 50% faster than current HBM3. That lets the H200 deliver 141GB of memory at 4.8 terabytes per second, double the capacity and 2.4 times the bandwidth of its predecessor, the Nvidia A100.
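As a rough sanity check, the comparison figures above can be worked backwards from the quoted 4.8 TB/s number. The implied HBM3 and A100 bandwidths below are derived purely from the ratios stated in this article, not taken from any spec sheet:

```python
# Back-of-the-envelope check of the bandwidth comparisons quoted above.
# Only the 4.8 TB/s H200 figure comes from the announcement; the HBM3
# and A100 numbers are implied by the stated ratios, not spec-sheet values.

h200_bw_tbps = 4.8  # H200 memory bandwidth, per Nvidia

# "50% faster than current HBM3" implies HBM3 runs at about:
implied_hbm3_tbps = h200_bw_tbps / 1.5

# "2.4 times the bandwidth of ... the Nvidia A100" implies:
implied_a100_tbps = h200_bw_tbps / 2.4

print(f"Implied HBM3 bandwidth: {implied_hbm3_tbps:.1f} TB/s")  # ~3.2 TB/s
print(f"Implied A100 bandwidth: {implied_a100_tbps:.1f} TB/s")  # ~2.0 TB/s
```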
Nvidia unveiled the first HBM3e processor, the GH200 Grace Hopper Superchip platform, in August "to meet [the] surging demand for generative AI," Nvidia founder and CEO Jensen Huang said at the time.
The introduction of the Nvidia H200 will lead to further performance leaps, the company said in a statement, adding that compared with its H100 offering, the new GPU will nearly double the inference speed on Llama 2, Meta's 70 billion-parameter LLM. Parameters are the values a neural network learns during training; a model's parameter count is a rough measure of its size.
"To create intelligence with generative AI and HPC applications, vast amounts of data must be efficiently processed at high speed using large, fast GPU memory," said Ian Buck, vice president of hyperscale and HPC at Nvidia, in a statement accompanying the announcement. "With Nvidia H200, the industry's leading end-to-end AI supercomputing platform just got faster to solve some of the world's most important challenges."
H200-powered systems are expected to start shipping in the second quarter of 2024, with the Nvidia H200 Tensor Core GPU available in HGX H200 server boards with four- and eight-way configurations.
An eight-way HGX H200 provides over 32 petaflops of FP8 deep learning compute and 1.1TB of aggregate high-bandwidth memory for the highest performance in generative AI and HPC applications, Nvidia said.
A petaflop is a measure of computing performance equal to one thousand trillion, or one quadrillion, floating point operations per second. FP8 is an eight-bit floating point format designed to ease the sharing of deep learning networks between hardware platforms.
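Putting those definitions together with the 8-way figures above, a short sketch shows how the aggregate numbers fall out. The 141GB-per-GPU and 32-petaflop figures come from the announcement; the per-GPU compute is simply that total divided by eight:

```python
# Arithmetic behind the 8-way HGX H200 figures quoted above.
# 141 GB per H200 and 32 petaflops of aggregate FP8 compute come from
# the announcement; per-GPU compute is derived by dividing by 8.

PFLOP = 10**15  # 1 petaflop = one quadrillion floating point ops per second

gpus = 8
mem_per_gpu_gb = 141           # H200 HBM3e capacity
aggregate_fp8_pflops = 32      # 8-way HGX H200 FP8 deep learning compute

total_mem_tb = gpus * mem_per_gpu_gb / 1000   # 1.128 TB, i.e. ~1.1 TB
per_gpu_fp8 = aggregate_fp8_pflops / gpus     # 4 petaflops per GPU

print(f"Aggregate HBM3e: {total_mem_tb:.1f} TB")
print(f"FP8 per GPU: {per_gpu_fp8:.0f} PFLOPS "
      f"({per_gpu_fp8 * PFLOP:.0e} FLOP/s)")
```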
The H200 can be deployed in any type of data center, including on-premises, cloud, hybrid cloud and edge environments, and will also be available in the GH200 Grace Hopper Superchip platform.
Nvidia powers new HPE AI training solution with GH200 Grace Hopper Superchips
Two weeks after it was revealed that the UK’s Isambard-AI supercomputer would be built with HPE’s Cray EX supercomputer technology and powered by Nvidia GH200 Grace Hopper Superchips, the two companies have once again teamed up to provide a new supercomputing turnkey system that supports the development of generative AI.
The new system comprises preconfigured and pretested AI and machine learning software, along with liquid-cooled supercomputers, accelerated compute, networking, storage and services. Based on the same architecture as Isambard-AI, the solution will integrate with HPE Cray supercomputing technology and be powered by Nvidia GH200 Grace Hopper Superchips, allowing AI research centers and large enterprises to speed up model training by two to three times.
"Together, this solution offers organizations the unprecedented scale and performance required for big AI workloads, such as large language model (LLM) and deep learning recommendation model (DLRM) training," HPE said in a press release.
The system will be generally available in December through HPE in more than 30 countries.