Deep Learning

NVIDIA H200 at SC23 - The Most Powerful GPU for AI & HPC

November 13, 2023

Exxact Offers a Glimpse into the NVIDIA H200, the Highest Performing GPU for AI

The past three years have highlighted how deeply artificial intelligence is being integrated into everyday workloads, and generative AI has brought these capabilities to the mainstream. At Exxact Corporation, our mission is to deliver high-performance GPU-accelerated solutions that inspire innovation. We are excited to soon offer solutions featuring the new NVIDIA H200 Tensor Core GPU, the newest addition to NVIDIA’s Hopper architecture of AI accelerators.

About the NVIDIA H200 Tensor Core GPU

H200 is the newest addition to NVIDIA’s leading AI and high-performance data center GPU portfolio, bringing massive compute to data centers. To maximize that compute performance, H200 is the world’s first GPU with HBM3e memory for 4.8TB/s of memory bandwidth, a 43% increase over H100. H200 also expands GPU memory capacity to 141GB, nearly double the H100’s 80GB. The combination of faster and larger HBM memory accelerates performance of computationally intensive generative AI and HPC applications, while meeting the evolving demands of growing model sizes.
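The percentage and "nearly double" comparisons above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the bandwidth and capacity figures quoted in this post; the constant and function names are illustrative.

```python
# Figures quoted in this post for the SXM parts.
H100_BANDWIDTH_TBS = 3.35
H200_BANDWIDTH_TBS = 4.8
H100_MEMORY_GB = 80
H200_MEMORY_GB = 141


def relative_increase(old: float, new: float) -> float:
    """Return the fractional increase from old to new (0.43 means +43%)."""
    return (new - old) / old


bandwidth_gain = relative_increase(H100_BANDWIDTH_TBS, H200_BANDWIDTH_TBS)
capacity_ratio = H200_MEMORY_GB / H100_MEMORY_GB

print(f"Memory bandwidth increase: {bandwidth_gain:.0%}")
print(f"Memory capacity ratio: {capacity_ratio:.2f}x")
```

The capacity ratio comes out at roughly 1.76x, which is what "nearly double" refers to.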

Based on the published specifications, raw compute performance is the same for the H100 and the H200. The H200's improvements all come from its faster, higher-capacity 141GB of HBM3e memory.
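To give a rough sense of what the extra capacity means for model size, the sketch below estimates the largest model whose weights alone would fit on a single GPU. It rests on loudly stated assumptions that are not from NVIDIA or Exxact: 1GB is taken as 1e9 bytes, only weight storage is counted, and activations, KV cache, optimizer state, and framework overhead are ignored, so real limits will be lower.

```python
# Assumption: bytes needed to store one weight at each precision.
BYTES_PER_PARAM = {"FP16/BF16": 2, "FP8/INT8": 1}

# Memory capacities quoted in this post.
GPU_MEMORY_GB = {"H100 SXM": 80, "H200 SXM": 141}


def max_params_billions(memory_gb: float, bytes_per_param: int) -> float:
    """Upper bound on parameters (in billions) for a weights-only fit.

    Assumes 1 GB = 1e9 bytes and ignores every non-weight memory consumer.
    """
    total_bytes = memory_gb * 1e9
    return total_bytes / bytes_per_param / 1e9


for gpu, memory_gb in GPU_MEMORY_GB.items():
    for precision, width in BYTES_PER_PARAM.items():
        limit = max_params_billions(memory_gb, width)
        print(f"{gpu} at {precision}: up to ~{limit:.1f}B parameters (weights only)")
```

Under these assumptions, the 16-bit weights-only ceiling moves from about 40 billion parameters on the H100 to about 70 billion on the H200.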

Specification | H100 SXM | H200 SXM
FP64 teraFLOPS | 34 | 34
FP64 Tensor Core teraFLOPS | 67 | 67
FP32 teraFLOPS | 67 | 67
TF32 Tensor Core teraFLOPS | 989 | 989
BFLOAT16 Tensor Core teraFLOPS | 1,979 | 1,979
FP16 Tensor Core teraFLOPS | 1,979 | 1,979
FP8 Tensor Core teraFLOPS | 3,958 | 3,958
INT8 Tensor Core TOPS | 3,958 | 3,958
GPU Memory | 80GB HBM3 | 141GB HBM3e
GPU Memory Bandwidth | 3.35TB/s | 4.8TB/s
Decoders | 7 NVDEC, 7 JPEG | 7 NVDEC, 7 JPEG
Confidential Computing | Supported | Supported
Max Thermal Design Power (TDP) | Up to 700W (configurable) | Up to 700W (configurable)
Multi-Instance GPUs | Up to 7 MIGs at 10GB each | Up to 7 MIGs at 16.5GB each
Form Factor | SXM | SXM
Interconnect | NVLink: 900GB/s; PCIe Gen5: 128GB/s | NVLink: 900GB/s; PCIe Gen5: 128GB/s
Server Options | NVIDIA HGX H100 | NVIDIA HGX H200
NVIDIA AI Enterprise | Add-on | Add-on
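The Multi-Instance GPU row can be read as simple arithmetic: seven slices of a fixed size carved out of the total memory. The sketch below only multiplies the table's figures; the helper name is illustrative, and it makes no claim about how the remaining memory is used.

```python
MIG_INSTANCES = 7

# (total memory in GB, per-instance memory in GB), taken from the table above.
MIG_CONFIGS = {
    "H100 SXM": (80, 10.0),
    "H200 SXM": (141, 16.5),
}


def mig_usage(total_gb: float, slice_gb: float, instances: int = MIG_INSTANCES):
    """Return (GB covered by all slices, fraction of total memory covered)."""
    used_gb = slice_gb * instances
    return used_gb, used_gb / total_gb


for gpu, (total_gb, slice_gb) in MIG_CONFIGS.items():
    used_gb, fraction = mig_usage(total_gb, slice_gb)
    print(f"{gpu}: {MIG_INSTANCES} x {slice_gb}GB = {used_gb}GB ({fraction:.0%} of {total_gb}GB)")
```

Each H200 slice holds 65% more memory than an H100 slice (16.5GB versus 10GB), so every isolated instance can host a larger model or batch.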

With a larger memory buffer, Exxact customers will be able to rely on NVIDIA H200 GPUs for large language models and a diverse range of inference needs. The NVIDIA H200 will be found in Exxact servers featuring the NVIDIA HGX H200 system board, a server building block available as an integrated baseboard with four or eight H200 SXM5 GPUs. An eight-way HGX H200 offers full GPU-to-GPU bandwidth through NVIDIA NVSwitch, delivering about 32 petaFLOPS of FP8 deep learning compute and over 1.1TB of aggregate HBM memory.
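The aggregate figures for an HGX H200 board follow from multiplying the per-GPU values in the spec table by the GPU count. A minimal sketch, assuming simple linear scaling and decimal units (1,000 teraFLOPS per petaFLOPS, 1,000GB per TB); the function name is illustrative.

```python
# Per-GPU figures from the spec table in this post.
FP8_TFLOPS_PER_GPU = 3958
MEMORY_GB_PER_GPU = 141


def hgx_totals(gpu_count: int):
    """Return (aggregate FP8 petaFLOPS, aggregate memory in TB) for a board."""
    fp8_pflops = gpu_count * FP8_TFLOPS_PER_GPU / 1000
    memory_tb = gpu_count * MEMORY_GB_PER_GPU / 1000
    return fp8_pflops, memory_tb


for gpus in (4, 8):
    pflops, memory_tb = hgx_totals(gpus)
    print(f"{gpus}-way HGX H200: {pflops:.1f} petaFLOPS FP8, {memory_tb:.2f}TB HBM3e")
```

For the eight-way board this gives about 31.7 petaFLOPS and 1.13TB, so the 32 petaFLOPS figure quoted above is best read as a rounded value.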


NVIDIA Hopper Architecture Advancements

The NVIDIA H200 Tensor Core GPU, built on the Hopper architecture, uses a Transformer Engine and fourth-generation Tensor Cores to speed up fine-tuning by 5.5X over the Ampere-based A100 Tensor Core GPU. This performance increase lets enterprises and AI practitioners quickly optimize and deploy generative AI to benefit their business. Compared to training foundation models from scratch, fine-tuning offers better energy efficiency and faster access to the customized solutions needed to grow a business.

Energy efficiency and total cost of ownership (TCO) also reach new levels. The NVIDIA H200’s HBM3e memory delivers its higher performance within the same power profile as the H100. For AI factories and at-scale supercomputing systems, this provides an economic edge that propels the AI and scientific community forward. For at-scale deployments, H200 systems provide 5X more energy savings and 4X better cost-of-ownership savings over the NVIDIA A100.

With the increased power and scalability, seamless communication between every GPU in a server cluster is essential. Fourth-generation NVLink accelerates multi-GPU input and output (IO) across eight-GPU servers at 900GB/s bidirectional per GPU, over 7X the bandwidth of PCIe Gen5. NVSwitch now supports SHARP in-network computing, previously available only on InfiniBand, which provides a 3X increase in all-reduce throughput across eight-GPU servers compared to previous-generation A100 systems. When combined with the external NVIDIA NVLink Switch, the NVLink Switch System will enable scaling multi-GPU IO across multiple servers at 900GB/s bidirectional per NVIDIA H200 GPU.
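The "over 7X" comparison comes directly from the two interconnect figures in the spec table. The sketch below reproduces that ratio and adds an idealized transfer-time estimate for moving one GPU's worth of data. The transfer times are an assumption for illustration only: they treat the quoted peak figures as sustained throughput and ignore protocol overhead, latency, and directionality.

```python
# Peak interconnect bandwidths quoted in this post, in GB/s.
NVLINK_GBS = 900
PCIE_GEN5_GBS = 128

H200_MEMORY_GB = 141


def transfer_seconds(size_gb: float, bandwidth_gbs: float) -> float:
    """Idealized time to move size_gb at a constant peak bandwidth."""
    return size_gb / bandwidth_gbs


ratio = NVLINK_GBS / PCIE_GEN5_GBS
nvlink_time = transfer_seconds(H200_MEMORY_GB, NVLINK_GBS)
pcie_time = transfer_seconds(H200_MEMORY_GB, PCIE_GEN5_GBS)

print(f"NVLink vs PCIe Gen5: {ratio:.2f}x")
print(f"Moving {H200_MEMORY_GB}GB: {nvlink_time:.2f}s over NVLink, {pcie_time:.2f}s over PCIe Gen5")
```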

When to Expect the NVIDIA H200

Expect availability of the NVIDIA H200 in 2024. Talk to an Exxact representative today for updates on an Exxact server featuring NVIDIA HGX H200. Accelerate your computing infrastructure, increase your business’s productivity, and develop AI models with the highest-performing AI accelerator.

Have any questions? Looking for an HGX H200 or an alternative to power the most demanding workloads? Contact us today or explore our various customizable Deep Learning Training server platforms.
