NVIDIA Extends Ampere with RTX A6000 & A40
Today, NVIDIA released two new GPUs built on the Ampere architecture: the NVIDIA RTX A6000 for workstations and the NVIDIA A40 for server and datacenter workloads. The new cards feature second-generation RT Cores, third-generation Tensor Cores, and more CUDA cores than previous generations. With 48GB of GDDR6 memory (expandable to 96GB when two cards are linked with NVLink), the RTX A6000 is the new flagship GPU of the Quadro line, replacing the Turing-powered Quadro RTX 8000. The NVIDIA A40 has essentially the same specs but uses passive cooling and is intended for server environments.
NVIDIA RTX A6000 vs NVIDIA A40 vs Quadro RTX 8000
Specification | NVIDIA RTX A6000 | NVIDIA A40 | Quadro RTX 8000 |
CUDA Cores | 10752 | 10752 | 4608 |
Tensor Cores | 336 | 336 | 576 |
Memory Clock | 16Gbps GDDR6 | 14.5Gbps GDDR6 | 14Gbps GDDR6 |
Memory Bus Width | 384-bit | 384-bit | 384-bit |
VRAM | 48GB | 48GB | 48GB |
ECC | Partial (DRAM) | Partial (DRAM) | Partial (DRAM) |
Half Precision | ? | ? | 32.6 TFLOPS |
Single Precision | ? | ? | 16.3 TFLOPS |
Tensor Performance | ? | ? | 130.5 TFLOPS |
TDP | 300W | 300W | 295W |
Cooling | Active | Passive | Active |
NVLink | 1x NVLink Gen3 | 1x NVLink Gen3 | 1x NVLink Gen2 |
NVLink Speed | 112.5GB/sec | 112.5GB/sec | 50GB/sec |
GPU | GA102 | GA102 | TU102 |
Architecture | Ampere | Ampere | Turing |
Manufacturing Process | Samsung 8nm | Samsung 8nm | TSMC 12nm FFN |
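The memory rows in the table above imply each card's peak memory bandwidth: the per-pin data rate multiplied by the bus width, divided by 8 bits per byte. The short sketch below works through that arithmetic using only the figures from the table. The function and variable names are illustrative, and the results are theoretical peaks rather than measured throughput.

```python
def memory_bandwidth_gb_s(data_rate_gbps: float, bus_width_bits: int) -> float:
    """Peak bandwidth in GB/s: per-pin rate (Gbit/s) x bus width (bits) / 8 bits per byte."""
    return data_rate_gbps * bus_width_bits / 8


# (memory data rate in Gbps, bus width in bits), taken from the comparison table
cards = {
    "NVIDIA RTX A6000": (16.0, 384),
    "NVIDIA A40": (14.5, 384),
    "Quadro RTX 8000": (14.0, 384),
}

for name, (rate, width) in cards.items():
    print(f"{name}: {memory_bandwidth_gb_s(rate, width):.0f} GB/s")
```

For the A40 this gives 696 GB/s, which matches the GPU Memory Bandwidth figure NVIDIA lists for that card.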
More About the RTX A6000 & A40
Image of NVIDIA A40 – Source: NVIDIA
NVIDIA Ampere Architecture CUDA Cores
Double-speed processing for single-precision floating point (FP32) operations and improved power efficiency provide significant performance improvements for graphics and simulation workflows, such as complex 3D computer-aided design (CAD) and computer-aided engineering (CAE), on the desktop.
Second-Generation RT Cores
With up to 2X the throughput over the previous generation and the ability to concurrently run ray tracing with either shading or denoising capabilities, second-generation RT Cores deliver massive speedups for workloads like photorealistic rendering of movie content, architectural design evaluations, and virtual prototyping of product designs. This technology also speeds up the rendering of ray-traced motion blur for faster results with greater visual accuracy.
Third-Generation Tensor Cores
New Tensor Float 32 (TF32) precision provides up to 5X the training throughput over the previous generation to accelerate AI and data science model training without requiring any code changes. Hardware support for structural sparsity doubles the throughput for inferencing. Tensor Cores also bring AI to graphics with capabilities like DLSS, AI denoising, and enhanced editing for select applications.
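To give a feel for what TF32 trades away, the sketch below emulates its number format in plain Python. TF32 keeps the 8 exponent bits of FP32 (so the same range) but only 10 of its 23 mantissa bits. The helper `to_tf32` is a hypothetical name, and it simply truncates the low mantissa bits; that is an assumption made for illustration, and the rounding the hardware actually applies may differ.

```python
import struct


def to_tf32(x: float) -> float:
    """Emulate TF32 precision by truncation (illustrative assumption, not hardware behavior)."""
    # Reinterpret the value as its 32-bit IEEE 754 single-precision bit pattern.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # Keep sign (1 bit), exponent (8 bits), and the top 10 mantissa bits;
    # clear the low 13 of FP32's 23 mantissa bits.
    bits &= 0xFFFFE000
    return struct.unpack("<f", struct.pack("<I", bits))[0]


for value in (1.0, 1.0 + 2**-10, 1.0 + 2**-11, 1e30):
    print(value, "->", to_tf32(value))
```

The smallest step above 1.0 that survives is 2^-10, while a step of 2^-11 is lost. Large values such as 1e30 stay finite because the exponent width is unchanged from FP32.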
Third-Generation NVIDIA NVLink®
Increased GPU-to-GPU interconnect bandwidth provides a single scalable memory to accelerate graphics and compute workloads and tackle larger datasets.
PCI Express Gen 4
Support for PCI Express Gen 4 provides double the bandwidth of PCIe Gen 3, improving data-transfer speeds from CPU memory for data-intensive tasks like AI and data science.
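The doubling follows from the per-lane transfer rate: PCIe Gen 3 runs at 8 GT/s per lane and Gen 4 at 16 GT/s, both with 128b/130b encoding. The sketch below works through the arithmetic for an x16 slot. The function name is illustrative, and the results are theoretical per-direction peaks before protocol overhead, not measured transfer speeds.

```python
def pcie_bandwidth_gb_s(transfer_rate_gt_s: float, lanes: int = 16) -> float:
    """Theoretical per-direction bandwidth in GB/s for a PCIe Gen 3 or Gen 4 link."""
    encoding_efficiency = 128 / 130  # 128b/130b line encoding used by Gen 3 and Gen 4
    return transfer_rate_gt_s * lanes * encoding_efficiency / 8  # 8 bits per byte


gen3 = pcie_bandwidth_gb_s(8.0)   # Gen 3: 8 GT/s per lane
gen4 = pcie_bandwidth_gb_s(16.0)  # Gen 4: 16 GT/s per lane

print(f"PCIe Gen 3 x16: {gen3:.2f} GB/s per direction")
print(f"PCIe Gen 4 x16: {gen4:.2f} GB/s per direction")
print(f"Gen 4 / Gen 3:  {gen4 / gen3:.1f}x")
```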
Power Efficiency
Featuring a dual-slot, power-efficient design, the RTX A6000 is up to 2X more power efficient than Turing GPUs and is designed to fit into a wide range of workstations from worldwide OEM vendors.
NVIDIA RTX A6000 Specs at a Glance
GPU Memory | 48 GB GDDR6 with error-correcting code (ECC) |
Display Ports | 4x DisplayPort 1.4* |
Max Power Consumption | 300 W |
Graphics Bus | PCI Express Gen 4 x 16 |
Form Factor | 4.4″ (H) x 10.5″ (L) Dual Slot |
Thermal | Active |
NVLink | 2-way low profile (2-slot and 3-slot bridges); connects 2 RTX A6000 |
vGPU Software Support | NVIDIA GRID®, NVIDIA Quadro® Virtual Data Center Workstation, NVIDIA Virtual Compute Server |
vGPU Profiles Supported | 1 GB, 2 GB, 3 GB, 4 GB, 6 GB, 8 GB, 12 GB, 16 GB, 24 GB, 48 GB |
VR Ready | Yes |
NVIDIA A40 Specs at a Glance
GPU Memory | 48 GB GDDR6 with error-correcting code (ECC) |
GPU Memory Bandwidth | 696 GB/s |
Interconnect | NVIDIA NVLink 112.5 GB/s (bidirectional); PCIe Gen4 16 GB/s |
NVLink | 2-way low profile (2-slot) |
Display Ports | 3x DisplayPort 1.4* |
Max Power Consumption | 300 W |
Form Factor | 4.4″ (H) x 10.5″ (L) Dual Slot |
Thermal | Passive |
vGPU Software Support | NVIDIA GRID®, NVIDIA Quadro® Virtual Data Center Workstation, NVIDIA Virtual Compute Server |
vGPU Profiles Supported | 1 GB, 2 GB, 3 GB, 4 GB, 6 GB, 8 GB, 12 GB, 16 GB, 24 GB, 48 GB |
NVENC / NVDEC | 1x / 2x (includes AV1 decode) |
Secure and Measured Boot with Hardware Root of Trust | CEC 1712 |
NEBS Ready | Level 3 |
Power Connector | 8-pin CPU |
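Every vGPU profile size listed for these cards divides the 48 GB frame buffer evenly, so a rough upper bound on virtual GPUs per card is the total memory divided by the profile size. The sketch below shows that arithmetic. It assumes all instances on a card use the same profile, and it considers frame buffer only; the vGPU software may impose a lower cap per GPU, so NVIDIA's vGPU documentation remains the authority. The names `PROFILES_GB` and `max_instances` are illustrative.

```python
TOTAL_MEMORY_GB = 48  # frame buffer of one RTX A6000 or A40
PROFILES_GB = [1, 2, 3, 4, 6, 8, 12, 16, 24, 48]  # vGPU profile sizes from the spec tables


def max_instances(profile_gb: int, total_gb: int = TOTAL_MEMORY_GB) -> int:
    """Upper bound on same-size vGPU instances, based on frame buffer alone."""
    return total_gb // profile_gb


for profile in PROFILES_GB:
    print(f"{profile:>2} GB profile: up to {max_instances(profile)} instance(s) per GPU")
```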
NVIDIA RTX A6000 and NVIDIA A40 Available in New Systems
Exxact will offer the NVIDIA RTX A6000 and NVIDIA A40 across a broad spectrum of Deep Learning Workstations & Servers, Data Science Workstations, NVIDIA RTX Server, and other Ampere and NVIDIA Quadro-based systems.
Have any questions about the new NVIDIA GPUs, or system availability?
Contact Exxact Today
NVIDIA RTX A6000 and NVIDIA A40 GPUs Released – Here’s What You Should Know