In this blog, we examine benchmark results for Ansys Mechanical on NVIDIA GPUs. According to Ansys: "Ansys benchmarks provide comprehensive and fair comparative information concerning the performance of Ansys solvers on available hardware platforms. The benchmarks can be used to compare the performance of different hardware platforms when running Ansys solvers. To accomplish this goal, Ansys provides the suite of simulation cases that make up the Ansys benchmark suite to its hardware partners for their use in benchmarking hardware performance; then the data is reported to Ansys and compiled for this site. Cases are selected to represent typical usage and cover a range of mesh sizes and physical models."
Key Findings
- The Tesla V100S is nearly 2x faster than the Quadro RTX 6000/8000 on the models that use the sparse solver (the V19sp-X cases), due to its higher FP64 performance.
- Besides the GPU, a high-core-count CPU plays an essential role in Ansys application performance. In some cases, high-core-count processors alone outperform a CPU + GPU configuration, e.g. 2x Gold 6252 (24 cores each) vs. 2x Gold 6256 (12 cores each) + V100S.
System Specifications
Exxact Ansys Certified Rack-mountable Workstation - CPU: 2x Intel Xeon Gold 6256 3.60 GHz 12-Core CPUs, HT Off. GPU: NVIDIA Tesla V100S*, NVIDIA Quadro RTX 6000. Mem: 512 GB RAM. OS: CentOS 7.7.1908 64-bit.
*NVIDIA Tesla V100S is passively cooled and recommended only if used in a rack-mounted configuration for this type of workstation.
Ansys 2019 R2 Test Cases
- Power Supply Module (V19cg-1)
- Tractor Rear Axle (V19cg-2)
- Engine Block (V19cg-3)
- Gear Box (V19ln-1)
- Radial Impeller (V19ln-2)
- Peltier Cooling Block (V19sp-1)
- Semi-Submersible (V19sp-2)
- Speaker (V19sp-3)
- Turbine (V19sp-4)
- BGA (V19sp-5)
Notes
The benchmarks have been designed to run in about 48 hours on current-generation hardware with at least 128 GB of RAM. Longer runtimes can be expected if changes are made when running these benchmarks (for example, to run more variations of cores), if the benchmarks are run on older or slower hardware, or if hardware with less than 128 GB of physical memory is used.
Ansys Mechanical Run Time Benchmark: NVIDIA Tesla V100S vs Quadro RTX 6000 vs CPU Only
Cases | Tesla V100S | Quadro RTX 6000 | CPU Only |
---|---|---|---|
V19cg-1 | 158 | 187 | 406 |
V19cg-2 | 181 | 224 | 323 |
V19cg-3 | 200 | 232 | 394 |
V19ln-1 | 158 | 202 | 298 |
V19ln-2 | 123 | 262 | 412 |
V19sp-1 | 92 | 177 | 274 |
V19sp-2 | 181 | 297 | 508 |
V19sp-3 | 117 | 205 | 309 |
V19sp-4 | 72 | 152 | 244 |
V19sp-5 | 98 | 205 | 322 |
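To make the gaps in the table easier to read, the run times can be turned into speedup factors. The snippet below is a minimal sketch: the numbers are copied from the table above, while the names `run_times` and `speedups` are illustrative and not part of any Ansys tooling.

```python
# Run times copied from the table above, in the order
# (Tesla V100S, Quadro RTX 6000, CPU Only). Lower is faster.
run_times = {
    "V19cg-1": (158, 187, 406),
    "V19cg-2": (181, 224, 323),
    "V19cg-3": (200, 232, 394),
    "V19ln-1": (158, 202, 298),
    "V19ln-2": (123, 262, 412),
    "V19sp-1": (92, 177, 274),
    "V19sp-2": (181, 297, 508),
    "V19sp-3": (117, 205, 309),
    "V19sp-4": (72, 152, 244),
    "V19sp-5": (98, 205, 322),
}


def speedups(times):
    """Return the V100S speedup over (RTX 6000, CPU only) for one case."""
    v100s, rtx6000, cpu_only = times
    return rtx6000 / v100s, cpu_only / v100s


for case, times in run_times.items():
    vs_rtx, vs_cpu = speedups(times)
    print(f"{case}: {vs_rtx:.2f}x vs RTX 6000, {vs_cpu:.2f}x vs CPU only")
```

On these figures the sparse-solver cases (V19sp-X) land between roughly 1.6x and 2.1x against the RTX 6000, while the V19cg-X cases stay near 1.2x.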
Ansys Benchmarks - Mechanical Core Solver Rating: NVIDIA Tesla V100S vs Quadro RTX 6000 vs CPU Only
Cases | Tesla V100S | Quadro RTX 6000 | CPU Only |
---|---|---|---|
V19cg-1 | 902.82 | 734.07 | 273.24 |
V19cg-2 | 751.30 | 568.05 | 360.90 |
V19cg-3 | 674.47 | 544.42 | 283.00 |
V19ln-1 | 734.07 | 529.74 | 343.27 |
V19ln-2 | 799.26 | 350.37 | 216.22 |
V19sp-1 | 1102.04 | 528.44 | 330.28 |
V19sp-2 | 541.35 | 313.16 | 176.98 |
V19sp-3 | 64.36 | 24.07 | 14.68 |
V19sp-4 | 58.78 | 22.45 | 13.07 |
V19sp-5 | 31.91 | 12.64 | 7.58 |
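The ratings can also be compared per solver family. The sketch below groups the cases by the two-letter code in their names (cg, ln, sp) and takes the geometric mean of the V100S to RTX 6000 rating ratio. It assumes that a higher rating is better, as the figures suggest; the helper names `family` and `gpu_ratio_by_family` are hypothetical.

```python
from statistics import geometric_mean  # available in Python 3.8+

# Core solver ratings copied from the table above, in the order
# (Tesla V100S, Quadro RTX 6000, CPU Only). Assumed: higher is better.
ratings = {
    "V19cg-1": (902.82, 734.07, 273.24),
    "V19cg-2": (751.30, 568.05, 360.90),
    "V19cg-3": (674.47, 544.42, 283.00),
    "V19ln-1": (734.07, 529.74, 343.27),
    "V19ln-2": (799.26, 350.37, 216.22),
    "V19sp-1": (1102.04, 528.44, 330.28),
    "V19sp-2": (541.35, 313.16, 176.98),
    "V19sp-3": (64.36, 24.07, 14.68),
    "V19sp-4": (58.78, 22.45, 13.07),
    "V19sp-5": (31.91, 12.64, 7.58),
}


def family(case):
    """Extract the family code from a case name, e.g. 'V19sp-3' -> 'sp'."""
    return case.split("-")[0][-2:]


def gpu_ratio_by_family(table):
    """Geometric mean of the V100S / RTX 6000 rating ratio per family."""
    groups = {}
    for case, (v100s, rtx6000, _cpu_only) in table.items():
        groups.setdefault(family(case), []).append(v100s / rtx6000)
    return {fam: geometric_mean(vals) for fam, vals in groups.items()}


for fam, ratio in gpu_ratio_by_family(ratings).items():
    print(f"{fam}: V100S rating is {ratio:.2f}x the RTX 6000 rating")
```

The geometric mean is used because the values being averaged are ratios; with it, the sp family shows the largest V100S advantage of the three.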
Power Supply Module (V19cg-1)
Analysis Type | Steady State Thermal |
---|---|
Number of Degrees of Freedom | 5,300,000 |
Equation Solver | JCG |
Matrix | Symmetric |
Tractor Rear Axle (V19cg-2)
Analysis Type | Steady Linear Thermal |
---|---|
Number of Degrees of Freedom | 12,300,000 |
Equation Solver | PCG |
Matrix | Symmetric |
Engine Block (V19cg-3)
Analysis Type | Static Linear Structural |
---|---|
Number of Degrees of Freedom | 14,200,000 |
Equation Solver | PCG |
Matrix | Symmetric |
Gear Box (V19ln-1)
Analysis Type | Modal Structural |
---|---|
Number of Degrees of Freedom | 7,700,000 |
Equation Solver | PCG |
Matrix | Symmetric |
Radial Impeller (V19ln-2)
Analysis Type | Modal - Cyclic Symmetry, Structural |
---|---|
Number of Degrees of Freedom | 2,000,000 |
Equation Solver | Block Lanczos |
Matrix | Symmetric |
Peltier Cooling Block (V19sp-1)
Analysis Type | Static Nonlinear Thermal-Electric Coupled Field |
---|---|
Number of Degrees of Freedom | 650,000 |
Equation Solver | Sparse |
Matrix | Non-symmetric (Unsymmetric) |
Semi-Submersible (V19sp-2)
Analysis Type | Transient Nonlinear Structural |
---|---|
Number of Degrees of Freedom | 4,700,000 |
Equation Solver | Sparse |
Matrix | Non-symmetric (Unsymmetric) |
Speaker (V19sp-3)
Analysis Type | Harmonic Linear Structural |
---|---|
Number of Degrees of Freedom | 1,700,000 |
Equation Solver | Sparse |
Matrix | Non-symmetric (Unsymmetric) |
Turbine (V19sp-4)
Analysis Type | Static Nonlinear Structural |
---|---|
Number of Degrees of Freedom | 3,200,000 |
Equation Solver | Sparse |
Matrix | Symmetric |
BGA (V19sp-5)
Analysis Type | Static Nonlinear Structural |
---|---|
Number of Degrees of Freedom | 6,000,000 |
Equation Solver | Sparse |
Matrix | Symmetric |