NVIDIA has recently announced the launch of its next-generation AI supercomputer chips, which are expected to have a significant impact on the field of AI and deep learning. These new chips are anticipated to drive advancements in technologies such as large language models (LLMs) and will be utilized for various applications including weather and climate prediction, drug discovery, quantum computing, and more.
The key product in NVIDIA’s new lineup is the H200 GPU, shipped on the HGX H200 board. It is based on the company’s “Hopper” architecture and is set to succeed the popular H100. The new chip uses HBM3e memory, which is both faster and higher-capacity than earlier memory generations. According to NVIDIA, the H200 delivers 141GB of memory at 4.8 terabytes per second, nearly double the capacity and 2.4 times the bandwidth of the NVIDIA A100. These enhanced capabilities make the H200 well-suited for the memory-intensive workloads associated with large language models.
In terms of AI benefits, NVIDIA claims that the HGX H200 is capable of doubling the inference speed on Llama 2, a 70 billion-parameter LLM, in comparison to the H100. The GPU will be available in 4- and 8-way configurations, compatible with both the software and hardware in H100 systems, and is expected to be deployed in various data centers, including on-premises, cloud, hybrid-cloud, and edge environments. Major cloud providers such as Amazon Web Services, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure are set to adopt the new technology, with an anticipated launch in Q2 2024.
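Those bandwidth figures map fairly directly onto LLM inference speed: in single-batch decoding, every generated token requires streaming the full set of model weights through the GPU, so memory bandwidth divided by model size gives a rough ceiling on tokens per second. The sketch below illustrates the arithmetic using the 4.8 TB/s figure NVIDIA quotes for the H200; the assumed 16-bit weight format and the round ~2 TB/s figure for the A100 are illustrative assumptions, not official benchmarks.

```python
# Back-of-envelope: why memory bandwidth bounds LLM token generation.
# Assumptions (illustrative, not official specs): 70B parameters stored
# as 16-bit (2-byte) weights; A100 bandwidth rounded to 2.0 TB/s.

def tokens_per_second_ceiling(params: float, bytes_per_param: float,
                              bandwidth_bytes_per_s: float) -> float:
    """Upper bound on single-batch decode rate: each token requires
    reading the entire weight set from memory once."""
    model_bytes = params * bytes_per_param
    return bandwidth_bytes_per_s / model_bytes

LLAMA2_70B = 70e9   # parameters
FP16_BYTES = 2      # bytes per weight (assumed 16-bit format)
H200_BW = 4.8e12    # 4.8 TB/s, per NVIDIA
A100_BW = 2.0e12    # ~2 TB/s, assumed round figure

h200 = tokens_per_second_ceiling(LLAMA2_70B, FP16_BYTES, H200_BW)
a100 = tokens_per_second_ceiling(LLAMA2_70B, FP16_BYTES, A100_BW)
print(f"H200 ceiling: {h200:.1f} tok/s")   # ~34.3 tok/s
print(f"A100 ceiling: {a100:.1f} tok/s")   # ~14.3 tok/s
print(f"speedup: {h200 / a100:.1f}x")      # tracks the 2.4x bandwidth gap
```

Under these assumptions the speedup is exactly the bandwidth ratio, which is why bandwidth-bound decoding benefits so directly from HBM3e.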
In addition to the H200 GPU, NVIDIA’s other key product is the GH200 Grace Hopper “superchip,” which pairs the H200 GPU with an Arm-based NVIDIA Grace CPU over the NVLink-C2C interconnect. This superchip is designed to accelerate complex AI and HPC applications that run on terabytes of data, enabling scientists and researchers to tackle the world’s most challenging problems. The GH200 is expected to be used in over 40 AI supercomputers across global research centers, system makers, and cloud providers, including partnerships with Dell, Eviden, Hewlett Packard Enterprise (HPE), Lenovo, QCT, and Supermicro. Notable deployments include HPE’s Cray EX2500 supercomputers, which are set to incorporate quad GH200s, scaling up to tens of thousands of Grace Hopper Superchip nodes.
One of the most significant installations featuring the GH200 superchip is the JUPITER system at the Jülich research facility in Germany. Once installed in 2024, JUPITER is expected to become the “world’s most powerful AI system.” It will use a liquid-cooled architecture comprising close to 24,000 NVIDIA GH200 Superchips interconnected with the NVIDIA Quantum-2 InfiniBand networking platform. This supercomputer is poised to drive scientific breakthroughs in areas such as climate and weather prediction, drug discovery, quantum computing, and industrial engineering.
The new technologies are pivotal for NVIDIA, as the company generates the majority of its revenue from the AI and data center segments. The company recently achieved record revenue in this area, totaling $10.32 billion in the last quarter alone. With the introduction of the new GPU and superchip, NVIDIA aims to build on this success and continue its dominance in the AI and data center market.
Furthermore, these advancements underscore NVIDIA’s commitment to pushing the boundaries of AI and supercomputing, as evidenced by the company’s recent record in AI training benchmarks achieved with existing H100 hardware. The introduction of the new chips is expected to further extend NVIDIA’s lead over its rivals in the AI sector.
In conclusion, NVIDIA’s next-generation AI supercomputer chips are poised to drive significant advancements in AI, deep learning, and large language models, with applications spanning numerous industries and research domains. The deployment of these new technologies is anticipated to accelerate scientific breakthroughs and enable researchers and organizations to tackle complex challenges using the power of AI and supercomputing.