The most powerful end-to-end AI supercomputing platform.

NVIDIA HGX H100 Solutions


Extreme HPC Performance

NVIDIA HGX H100 8-GPU delivers 268 teraFLOPS of FP64 performance, over 3x the last-generation NVIDIA HGX A100.


Fourth-Gen NVLink

The NVIDIA HGX H100 utilizes fourth-generation NVLink for up to 900GB/s of GPU-to-GPU interconnect bandwidth, over 7X the bandwidth of PCIe Gen5.


Third Gen NVIDIA NVSwitch

NVIDIA NVSwitch™ creates a unified switching fabric that allows every GPU in the node, and even multiple HGX nodes via the NVLink Switch System, to function as a single gigantic GPU.

NVIDIA HGX H100 Platforms


NVIDIA HGX 4x H100 | Dual Intel Xeon Scalable | 4U

TS4-101818584

Starting at

$137,870.70

Highlights
CPU: 2x 4th/5th Gen Intel Xeon Scalable
GPU: NVIDIA HGX H100 4-GPU - 4x NVIDIA H100 SXM5 80GB + NVLink
MEM: Up to 8TB DDR5 ECC Memory
STO: 6x 2.5" NVMe & SATA Hot-swap
NET: 2x 10GBASE-T or 25GbE SFP28

NVIDIA HGX 8x H100 | Dual AMD EPYC 9004 | 8U

TS4-193475697

Starting at

$256,810.40

Highlights
CPU: 2x AMD EPYC 9004 Series
GPU: NVIDIA HGX H100 8-GPU - 8x NVIDIA H100 SXM5 80GB + NVLink Switch System
MEM: Up to 6TB DDR5 ECC Memory
STO: 24x 2.5" Hot-swap (16x NVMe, 8x SATA)
NET: 8x PCIe 5.0 LP slots connected to PLX switch

NVIDIA HGX 8x H100 | Dual Intel Xeon Scalable | 8U

TS4-117847628

Starting at

$265,034.00

Highlights
CPU: 2x 4th/5th Gen Intel Xeon Scalable
GPU: NVIDIA HGX H100 8-GPU - 8x NVIDIA H100 SXM5 80GB + NVLink Switch System
MEM: Up to 8TB DDR5 ECC Memory
STO: 24x 2.5" Hot-swap (16x NVMe, 8x SATA)
NET: 8x PCIe 5.0 LP slots connected to PLX switch

Build your ideal system

Need a bit of help? Contact our sales engineers directly.

HGX H100 Specifications - 4-GPU & 8-GPU

                              HGX H100 4-GPU             HGX H100 8-GPU
GPUs                          4x NVIDIA H100 80GB SXM5   8x NVIDIA H100 80GB SXM5
GPU Memory                    320GB HBM3                 640GB HBM3
Per-GPU Memory Bandwidth      3.35TB/s                   3.35TB/s
Aggregate Memory Bandwidth    13TB/s                     27TB/s
NVLink Generation / Speed     4th Gen / 900GB/s          4th Gen / 900GB/s
NVSwitch Generation / Speed   N/A                        3rd Gen / 900GB/s
FP64                          134 TFLOPS                 268 TFLOPS
FP64 Tensor Core              268 TFLOPS                 535 TFLOPS
FP32                          268 TFLOPS                 535 TFLOPS
TF32 Tensor Core*             3958 TFLOPS                7915 TFLOPS
FP16 Tensor Core*             7915 TFLOPS                15830 TFLOPS
FP8 Tensor Core*              15830 TFLOPS               31662 TFLOPS

*Tensor Core throughput shown with sparsity.
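As a sanity check, the aggregate rows in the table follow directly from the per-GPU H100 SXM5 figures; a minimal Python sketch (per-GPU numbers taken from the table itself, with the bandwidth totals lightly rounded in the table):

```python
# Per-GPU H100 SXM5 figures, as listed in the spec table above.
HBM3_PER_GPU_GB = 80         # GPU memory per H100 SXM5
MEM_BW_PER_GPU_TBS = 3.35    # memory bandwidth per GPU, TB/s
FP64_PER_GPU_TFLOPS = 33.5   # 134 TFLOPS across the 4-GPU baseboard

def aggregate(n_gpus):
    """Scale per-GPU figures to an n-GPU HGX baseboard."""
    return {
        "gpu_memory_gb": n_gpus * HBM3_PER_GPU_GB,
        "mem_bw_tbs": n_gpus * MEM_BW_PER_GPU_TBS,
        "fp64_tflops": n_gpus * FP64_PER_GPU_TFLOPS,
    }

# 4-GPU: 320GB, 13.4TB/s (table rounds to 13TB/s), 134 TFLOPS
# 8-GPU: 640GB, 26.8TB/s (table rounds to 27TB/s), 268 TFLOPS
print(aggregate(4))
print(aggregate(8))
```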

High-Performance Computing with NVIDIA H100

To unlock next-generation discoveries, scientists look to high-performance computing to understand complex molecules for drug discovery, simulate mechanical and fluid dynamics, parse lengthy genome sequences, extrapolate data science research, and train next-generation generative AI.

H100 continues to deliver high-performance innovation, tripling the FLOPS of the double-precision FP64 Tensor Cores over the A100. AI-fused HPC applications can also leverage H100's TF32 precision to achieve one petaflop of throughput for single-precision matrix-multiply operations, with zero code changes.
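A quick back-of-envelope check of both claims in the paragraph above, using the per-GPU figures implied by the spec table; the A100 FP64 Tensor Core rate of 19.5 TFLOPS is NVIDIA's published spec, not a figure from this page:

```python
# Per-GPU H100 rates derived from the 8-GPU spec table (value / 8).
H100_TF32_TFLOPS = 989       # TF32 Tensor Core, with sparsity (7915 / 8)
H100_FP64_TC_TFLOPS = 67     # FP64 Tensor Core (535 / 8, rounded)
A100_FP64_TC_TFLOPS = 19.5   # A100 spec (assumption: NVIDIA datasheet value)

# "One petaflop of TF32 throughput" per GPU:
tf32_pflops = H100_TF32_TFLOPS / 1000           # ~0.99 PFLOPS

# "Tripling the FLOPS" of FP64 Tensor Cores over A100:
fp64_speedup = H100_FP64_TC_TFLOPS / A100_FP64_TC_TFLOPS  # ~3.4x
print(tf32_pflops, fp64_speedup)
```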

AI training benchmark configurations:
  1. LoRA fine-tuning (GPT-40GB), 8 GPUs: global train batch size 128 (sequences), seq-length 256 (tokens)
  2. Small model training (GPT-7B, GPT-13B), 8-GPU averaged: global train batch size 512 (sequences), seq-length 2048 (tokens)
  3. Large model training (GPT-175B), 256 GPUs (32 nodes): global train batch size 2048 (sequences), seq-length 2048 (tokens)

HPC benchmark configurations:
  1. 3D fast Fourier transform (FFT) (4K^3) throughput: HGX A100 8-GPU with HDR IB network vs. HGX H100 8-GPU NVLink Switch System with NDR IB
  2. Genome sequencing (Smith-Waterman): 1x A100 vs. 1x H100
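For scale, the training configurations above each process batch size x sequence length tokens per optimizer step; a tiny sketch (the short labels are our own, not NVIDIA's):

```python
# (global batch size in sequences, sequence length in tokens)
# for the benchmark configurations listed above.
configs = {
    "LoRA fine-tune": (128, 256),
    "small model (GPT-7B/13B)": (512, 2048),
    "large model (GPT-175B)": (2048, 2048),
}

# Tokens processed per global batch (one optimizer step).
tokens_per_step = {name: b * s for name, (b, s) in configs.items()}
for name, tokens in tokens_per_step.items():
    print(f"{name}: {tokens:,} tokens per step")
```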
AI Can Accelerate the Workflow of Any Industry

Start Training Your Own AI Model Today

NVIDIA AI Enterprise enables users to harness the power of AI through an optimized, streamlined development and deployment framework. Coupled with NVIDIA Enterprise Support and Training Services, developers can work with professionals who assist with and teach AI best practices. Train and deploy an AI model tailored to your deep learning goals today.


Partnerships

NVIDIA
PNY
Panasas
Ansys
Bright Computing
BeeGFS