Molecular Dynamics

NVIDIA GPU Benchmarks AMBER 22

October 30, 2023
10 min read
Updated: 10/30/2023 > [NEW] NVIDIA RTX 5000 and RTX 4500 Ada + RTX 4080 Benchmarks
Updated: 03/21/2023 > [NEW] NVIDIA RTX 6000 Ada Generation Benchmarks
Updated: 02/07/2023 > [NEW] NVIDIA H100 (PCIe) and RTX 4070 Ti Benchmarks
Updated: 10/18/2022 > [NEW] NVIDIA RTX 4090 Benchmarks
Last Update: 06/23/2022 > Benchmarks Uploaded

AMBER 22 GPU Benchmarks for Molecular Dynamics with NVIDIA Professional and Data Center GPUs

The following Amber 22 benchmarks were performed on an Exxact AMBER Certified MD System using the AMBER 22 Benchmark Suite, with the GPUs listed in the GPU Benchmark Overview table.

*All benchmarks were run in a single-GPU configuration with Amber 22 Update 1 and AmberTools 22 Update 1. NVIDIA CUDA 11.4 was also used for these benchmarks.

**NVIDIA GeForce RTX GPUs were tested on an Exxact workstation, which supports a maximum of a 2-way GPU configuration. All other NVIDIA Professional GPUs (RTX and Data Center GPUs) were tested in an Exxact server, which supports an 8-way GPU configuration.

***Because the AMBER computations in these benchmarks run on the GPU via CUDA, the difference between the workstation and server CPUs should have little effect on comparisons between benchmarks.
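As a rough illustration of the single-GPU setup described above, the sketch below builds a `pmemd.cuda` command pinned to one device through the `CUDA_VISIBLE_DEVICES` environment variable. The helper name `single_gpu_command` and the file names (`mdin`, `prmtop`, `inpcrd`, `mdout`) are placeholders chosen for illustration. This is not the exact invocation used for these benchmarks.

```python
import os


def single_gpu_command(gpu_index, mdin="mdin", prmtop="prmtop", inpcrd="inpcrd"):
    """Return (argv, env) for one pmemd.cuda run restricted to a single GPU.

    The file names are hypothetical placeholders, not the benchmark inputs.
    """
    env = dict(os.environ)
    # Expose only one device to the process so the run stays on a single GPU.
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_index)
    # -O overwrites existing output; -i, -p, -c, -o name the control,
    # topology, coordinate, and output files.
    argv = ["pmemd.cuda", "-O", "-i", mdin, "-p", prmtop, "-c", inpcrd, "-o", "mdout"]
    return argv, env


argv, env = single_gpu_command(0)
print("CUDA_VISIBLE_DEVICES=" + env["CUDA_VISIBLE_DEVICES"], " ".join(argv))
```

On a machine with AMBER installed, the returned pair could be passed to `subprocess.run(argv, env=env, check=True)`. Changing `gpu_index` selects a different card in a multi-GPU system.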

Quick AMBER GPU Benchmark takeaways

  • NVIDIA Ada Lovelace generation GPUs outperform their Ampere counterparts by a wide margin. While there is a price increase, the added performance and higher energy efficiency can play a large part in choosing your next GPUs.
    • The NVIDIA RTX 4090 is the clear winner, posting the fastest result in 8 of the 12 tests.
    • The RTX 6000 Ada uses the same GPU die as the RTX 4090. Built for peak reliability, it trades clock speed for a larger 48GB memory capacity.
    • The RTX 5000 Ada beats last generation's flagship RTX A6000 in every test, and the RTX 4500 Ada beats it in most. These might be the new best GPUs for AMBER workloads considering cost and performance.
    • Even the mid-range consumer RTX 4070 Ti outperforms the RTX 3090 in nearly every test.
  • The NVIDIA H100 ranks roughly third (behind the RTX 4090 and RTX 6000 Ada), winning only a couple of tests. However, the H100 offers far more scalability as a data center GPU: a single server can host up to 8x NVIDIA H100s.
  • For larger simulations, such as STMV Production NPT 4fs, high-speed memory, memory capacity, and GPU clock speed play a large role in performance. The H100, RTX 6000 Ada, and RTX 4090 dominate here.
  • For smaller simulations, the range of viable options is wider. The RTX 4070 Ti shows promising performance, while the RTX 5000 Ada and RTX 4080 deliver exceptional performance, trailing the larger RTX 6000 Ada and RTX 4090.
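The ranking claims in the takeaways can be checked directly against the overview table. The sketch below is a minimal example that copies the table rows for four of the fastest cards and counts how often each one leads. The helper `fastest_gpu` and the shortened benchmark labels are illustrative names, not part of the AMBER benchmark suite.

```python
from collections import Counter

# Throughput values copied from the overview table (AMBER reports ns/day).
# Only four of the fastest cards are included to keep the example short.
GPUS = ["RTX 6000 Ada", "H100 PCIe", "RTX 4090", "RTX 4080"]
RESULTS = {
    "JAC NVE 4fs":       [1659.72, 1479.32, 1681.55, 1549.28],
    "JAC NPT 4fs":       [1604.32, 1424.90, 1598.08, 1508.34],
    "JAC NVE 2fs":       [881.64, 779.95, 892.16, 822.80],
    "JAC NPT 2fs":       [846.02, 701.09, 851.84, 790.88],
    "FactorIX NVE 2fs":  [456.82, 389.18, 462.07, 372.48],
    "FactorIX NPT 2fs":  [418.98, 357.88, 440.83, 354.01],
    "Cellulose NVE 2fs": [116.86, 119.27, 129.48, 91.85],
    "Cellulose NPT 2fs": [108.79, 108.91, 120.56, 86.87],
    "STMV NPT 4fs":      [68.50, 70.15, 79.81, 54.30],
    "TRPCage GB 2fs":    [1465.03, 1413.28, 1492.86, 1567.07],
    "Myoglobin GB 2fs":  [1005.85, 1094.48, 905.33, 837.89],
    "Nucleosome GB 2fs": [31.75, 37.68, 36.30, 27.54],
}


def fastest_gpu(values):
    """Return the name of the GPU with the highest throughput in one row."""
    best_index = max(range(len(values)), key=lambda i: values[i])
    return GPUS[best_index]


# Map each benchmark to its leading GPU, then tally wins per GPU.
winners = {name: fastest_gpu(values) for name, values in RESULTS.items()}
wins = Counter(winners.values())

for name, gpu in winners.items():
    print(f"{name:<18} {gpu}")
print(dict(wins))
```

Adding the remaining columns from the table to `GPUS` and `RESULTS` extends the same tally to all 15 cards.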

Interested in getting faster results?
Learn more about the only AMBER Certified GPU Systems starting around $6,000


Exxact Test Bench System Specs:

| | Workstation | Server |
| --- | --- | --- |
| System SKU | VWS-148320247 | TS4-173535991 |
| Nodes | 1 | 1 |
| Processor / Count | 1x AMD TR PRO 5995WX | 2x AMD EPYC 7552 |
| Total Logical Cores | 64 | 96 |
| Memory | 256GB DDR4 | 512GB DDR4 ECC |
| Storage | 4TB NVMe SSD | 2.84TB NVMe SSD |
| OS | CentOS 7 | CentOS 7 |
| CUDA Version | 12.0 | 12.0 |
| AMBER Version | 22 | 22 |

GPU Benchmark Overview

| Benchmark (ns/day) | RTX 6000 Ada | RTX 5000 Ada | RTX 4500 Ada | H100 PCIe | RTX 4090 | RTX 4080 | RTX 4070 Ti | A100 PCIe | RTX A6000 | RTX A5500 | RTX A5000 | RTX A4500 | RTX A4000 | RTX 3090 | RTX 3080 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| GPU class | RTX (Ada) | RTX (Ada) | RTX (Ada) | Data Center (Hopper) | GeForce (Ada) | GeForce (Ada) | GeForce (Ada) | Data Center (Ampere) | RTX (Ampere) | RTX (Ampere) | RTX (Ampere) | RTX (Ampere) | RTX (Ampere) | GeForce (Ampere) | GeForce (Ampere) |
| JAC Production NVE 4fs | 1659.72 | 1502.87 | 1276.60 | 1479.32 | 1681.55 | 1549.28 | 1322.64 | 1199.22 | 1101.29 | 1092.34 | 1008.05 | 939.41 | 810.00 | 1196.50 | 1101.24 |
| JAC Production NPT 4fs | 1604.32 | 1470.94 | 1242.04 | 1424.90 | 1598.08 | 1508.34 | 1262.40 | 1194.50 | 1084.37 | 1062.12 | 992.14 | 927.52 | 803.02 | 1157.76 | 1086.21 |
| JAC Production NVE 2fs | 881.64 | 799.28 | 659.21 | 779.95 | 892.16 | 822.80 | 710.09 | 611.08 | 586.09 | 571.70 | 535.01 | 498.65 | 429.67 | 632.19 | 585.81 |
| JAC Production NPT 2fs | 846.02 | 772.04 | 640.98 | 701.09 | 851.84 | 790.88 | 666.18 | 610.09 | 560.05 | 541.00 | 505.58 | 485.37 | 412.73 | 595.28 | 557.60 |
| FactorIX Production NVE 2fs | 456.82 | 386.69 | 284.79 | 389.18 | 462.07 | 372.48 | 301.03 | 271.36 | 256.10 | 228.78 | 214.13 | 189.39 | 154.45 | 264.78 | 234.58 |
| FactorIX Production NPT 2fs | 418.98 | 361.79 | 271.28 | 357.88 | 440.83 | 354.01 | 279.19 | 252.87 | 241.63 | 223.14 | 206.78 | 181.13 | 150.12 | 248.65 | 217.50 |
| Cellulose Production NVE 2fs | 116.86 | 91.61 | 63.27 | 119.27 | 129.48 | 91.85 | 69.30 | 85.23 | 59.52 | 51.71 | 47.09 | 39.91 | 31.26 | 63.23 | 53.44 |
| Cellulose Production NPT 2fs | 108.79 | 85.88 | 60.86 | 108.91 | 120.56 | 86.87 | 64.65 | 77.98 | 55.50 | 49.74 | 45.71 | 38.74 | 30.34 | 58.30 | 49.69 |
| STMV Production NPT 4fs | 68.50 | 51.73 | 35.12 | 70.15 | 79.81 | 54.30 | 37.31 | 52.02 | 37.01 | 33.70 | 30.87 | 26.52 | 20.27 | 38.65 | 32.18 |
| TRPCage GB 2fs | 1465.03 | 1453.69 | 1440.36 | 1413.28 | 1492.86 | 1567.07 | 1519.47 | 1040.61 | 1166.26 | 1187.99 | 1235.49 | 1183.58 | 1244.75 | 1225.53 | 1332.27 |
| Myoglobin GB 2fs | 1005.85 | 859.04 | 749.58 | 1094.48 | 905.33 | 837.89 | 757.91 | 661.22 | 650.48 | 595.17 | 586.42 | 531.74 | 492.48 | 621.73 | 619.67 |
| Nucleosome GB 2fs | 31.75 | 25.98 | 19.24 | 37.68 | 36.30 | 27.54 | 21.34 | 29.66 | 20.37 | 15.71 | 15.60 | 11.92 | 11.02 | 21.08 | 17.72 |
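To put a number on the generational gap, the sketch below computes speedup ratios between newer and older cards using values copied from the table above. Pairing each newer card with a specific predecessor (for example, RTX 6000 Ada against RTX A6000) is an assumption made for illustration, and `speedup` is a hypothetical helper name.

```python
# Throughput values copied from the overview table (AMBER reports ns/day).
THROUGHPUT = {
    "RTX 6000 Ada": {"JAC NVE 4fs": 1659.72, "Cellulose NVE 2fs": 116.86, "STMV NPT 4fs": 68.50},
    "RTX A6000":    {"JAC NVE 4fs": 1101.29, "Cellulose NVE 2fs": 59.52, "STMV NPT 4fs": 37.01},
    "RTX 4090":     {"JAC NVE 4fs": 1681.55, "Cellulose NVE 2fs": 129.48, "STMV NPT 4fs": 79.81},
    "RTX 3090":     {"JAC NVE 4fs": 1196.50, "Cellulose NVE 2fs": 63.23, "STMV NPT 4fs": 38.65},
    "H100 PCIe":    {"JAC NVE 4fs": 1479.32, "Cellulose NVE 2fs": 119.27, "STMV NPT 4fs": 70.15},
    "A100 PCIe":    {"JAC NVE 4fs": 1199.22, "Cellulose NVE 2fs": 85.23, "STMV NPT 4fs": 52.02},
}

# Assumed (newer, older) pairings; the article does not define official counterparts.
PAIRS = [
    ("RTX 6000 Ada", "RTX A6000"),
    ("RTX 4090", "RTX 3090"),
    ("H100 PCIe", "A100 PCIe"),
]


def speedup(new_gpu, old_gpu, benchmark):
    """Ratio of newer-card throughput to older-card throughput on one benchmark."""
    return THROUGHPUT[new_gpu][benchmark] / THROUGHPUT[old_gpu][benchmark]


for new_gpu, old_gpu in PAIRS:
    for benchmark in THROUGHPUT[new_gpu]:
        ratio = speedup(new_gpu, old_gpu, benchmark)
        print(f"{new_gpu} vs {old_gpu} on {benchmark}: {ratio:.2f}x")
```

A ratio above 1.0 means the newer card is faster on that benchmark. Other rows or cards from the table can be added to `THROUGHPUT` in the same shape.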

AMBER 22 GPU Benchmark: JAC Production NVE 4fs

[Chart: JAC Production NVE 4fs AMBER benchmark]

AMBER 22 GPU Benchmark: JAC Production NPT 4fs

[Chart: JAC Production NPT 4fs AMBER benchmark]

AMBER 22 GPU Benchmark: JAC Production NVE 2fs

[Chart: JAC Production NVE 2fs AMBER benchmark]

AMBER 22 GPU Benchmark: JAC Production NPT 2fs

[Chart: JAC Production NPT 2fs AMBER benchmark]

AMBER 22 GPU Benchmark: FactorIX Production NVE 2fs

[Chart: FactorIX Production NVE 2fs AMBER benchmark]

AMBER 22 GPU Benchmark: FactorIX Production NPT 2fs

[Chart: FactorIX Production NPT 2fs AMBER benchmark]

AMBER 22 GPU Benchmark: Cellulose Production NVE 2fs

[Chart: Cellulose Production NVE 2fs AMBER benchmark]

AMBER 22 GPU Benchmark: Cellulose Production NPT 2fs

[Chart: Cellulose Production NPT 2fs AMBER benchmark]

AMBER 22 GPU Benchmark: STMV Production NPT 4fs

[Chart: STMV Production NPT 4fs AMBER benchmark]

AMBER 22 GPU Benchmark: TRPCage GB 2fs

[Chart: TRPCage GB 2fs AMBER benchmark]

AMBER 22 GPU Benchmark: Myoglobin GB 2fs

[Chart: Myoglobin GB 2fs AMBER benchmark]

AMBER 22 GPU Benchmark: Nucleosome GB 2fs

[Chart: Nucleosome GB 2fs AMBER benchmark]
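The benchmark names above carry a 2fs or 4fs label. Assuming that label is the integration timestep and the results are in ns/day, a throughput figure can be converted into MD steps per second, which makes the 2fs and 4fs runs of the same system easier to compare. The sketch below does this for the RTX 4090 JAC NVE results; `steps_per_second` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000


def steps_per_second(ns_per_day, timestep_fs):
    """Convert simulated ns/day into integration steps per wall-clock second."""
    steps_per_day = ns_per_day * FS_PER_NS / timestep_fs
    return steps_per_day / SECONDS_PER_DAY


# RTX 4090 values for the JAC NVE benchmark, copied from the overview table.
jac_4fs = steps_per_second(1681.55, timestep_fs=4)
jac_2fs = steps_per_second(892.16, timestep_fs=2)

print(f"JAC NVE 4fs: {jac_4fs:.0f} steps/s")
print(f"JAC NVE 2fs: {jac_2fs:.0f} steps/s")
print(f"ns/day ratio (4fs / 2fs): {1681.55 / 892.16:.2f}")
```

Under these assumptions, the two runs execute a similar number of steps per second, so most of the higher ns/day figure in the 4fs run comes from each step covering twice as much simulated time.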

Note about AMBER Benchmarks (From Dave Cerutti)

We take as benchmarks four periodic systems spanning a range of system sizes and compositions. The smallest Dihydrofolate Reductase (DHFR) case is a 159-residue protein in water, weighing in at 23,588 atoms. Next, from the human blood clotting system, Factor IX is a 379-residue protein also in a box of water, totaling 90,906 atoms. The larger cellulose system, with 408,609 atoms, has a greater content of macromolecules in it: the repeating sugar polymer constitutes roughly a sixth of the atoms in the system. Finally, the very large simulation of satellite tobacco mosaic virus (STMV), a gargantuan 1,067,095 atom system, also has an appreciable macromolecule content but is otherwise another collection of proteins in water. (source http://ambermd.org/GPUPerformance.php)
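One rough way to relate the system sizes in this note to the measured throughput is to multiply atom count by MD steps per second, giving atom-updates per second. The sketch below does this for the RTX 4090 NPT results, assuming the values are in ns/day, the 2fs/4fs label is the timestep, and the JAC benchmark corresponds to the DHFR system. The atom counts are taken as quoted in the note, and `atom_updates_per_second` is a hypothetical helper name.

```python
SECONDS_PER_DAY = 86_400
FS_PER_NS = 1_000_000

# (atom count from the note, RTX 4090 NPT throughput in ns/day, timestep in fs)
SYSTEMS = {
    "DHFR (JAC)": (23_588, 851.84, 2),
    "FactorIX":   (90_906, 440.83, 2),
    "Cellulose":  (408_609, 120.56, 2),
    "STMV":       (1_067_095, 79.81, 4),
}


def atom_updates_per_second(atoms, ns_per_day, timestep_fs):
    """Atoms multiplied by integration steps per wall-clock second."""
    steps_per_sec = ns_per_day * FS_PER_NS / timestep_fs / SECONDS_PER_DAY
    return atoms * steps_per_sec


updates = {name: atom_updates_per_second(*spec) for name, spec in SYSTEMS.items()}

for name, value in updates.items():
    print(f"{name:<11} {value / 1e6:8.1f} million atom-updates/s")
```

This is only a coarse normalization: it ignores differences in cutoffs, macromolecule content, and other run settings between the four systems.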

What is the AMBER Molecular Dynamics Package?

AMBER is a molecular dynamics software package that simulates biomolecules using molecular mechanical force fields. The name AMBER (Assisted Model Building with Energy Refinement) also refers to a family of force fields for molecular dynamics of biomolecules originally developed by Peter Kollman’s group at the University of California, San Francisco. The AMBER MD software package is maintained through an active collaboration between David Case at Rutgers University, Tom Cheatham at the University of Utah, Adrian Roitberg at the University of Florida, Ken Merz at Michigan State University, Carlos Simmerling at Stony Brook University, Ray Luo at UC Irvine, and Junmei Wang at Encysive Pharmaceuticals.


Have any questions?
Contact Exxact Today

