The Benefits of Multi-Processor Systems
With support for up to 6TB of system memory, 4-Way and 8-Way (multi-processor) systems can deliver more performance per server at a lower overall cost, because each system houses more processors. Multi-processor systems also limit server sprawl by consolidating workloads onto fewer machines, which reduces server hardware purchases. Fewer servers to power and cool means better energy efficiency, with the added bonus of less cabling and network equipment.
To take full advantage of the large memory footprints and capabilities of multi-processor systems, it is important to ensure high and consistent memory bandwidth by optimizing memory traffic. This can be achieved by using Non-Uniform Memory Access (NUMA) architecture tools and shared-memory multiprocessing programming interfaces such as OpenMP.
Processors operate much faster than the memory they use, so high performance depends on limiting how often a processor has to wait for memory. In a multi-processor system without NUMA, all processors share a single path to memory. Because only one processor can use that path at a time, the system may provide inadequate bandwidth as processors are added. NUMA addresses this bottleneck by giving each processor its own local memory, which prevents the performance setbacks that occur when several processors attempt to use the same memory. Memory access is therefore non-uniform: memory that is local to a processor has lower latency than memory attached to another processor, which must be reached over an interconnect.
Because NUMA greatly influences memory access performance, software optimizations are needed so that threads and processes are scheduled close to their in-memory data. Microsoft Windows 7 and Windows Server 2008 R2 added NUMA support for systems with more than 64 logical cores. Java 7 added a NUMA-aware memory allocator and garbage collector. Linux has included basic NUMA support since kernel version 2.5, with further improvements in subsequent kernel releases. OpenSolaris models the NUMA architecture with locality groups (lgroups).
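To make the idea of local memory concrete, the following minimal sketch lists which CPUs belong to which NUMA node on a Linux system. It assumes the kernel exposes its topology under /sys/devices/system/node, with one nodeN directory per node and a cpulist file inside each. The function names parse_cpulist and numa_topology are illustrative and not part of any library. On systems without that directory tree, the sketch simply reports no nodes.

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a Linux cpulist string such as "0-3,8-11" into sorted CPU ids."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue  # an empty file means the node has no CPUs
        if "-" in part:
            first, last = part.split("-")
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def numa_topology(sysfs_root="/sys/devices/system/node"):
    """Return {node_id: [cpu ids]}, or an empty dict if the tree is absent."""
    topology = {}
    for node_dir in glob.glob(os.path.join(sysfs_root, "node[0-9]*")):
        match = re.search(r"node(\d+)$", node_dir)
        if match is None:
            continue
        try:
            with open(os.path.join(node_dir, "cpulist")) as handle:
                topology[int(match.group(1))] = parse_cpulist(handle.read())
        except OSError:
            continue  # skip nodes that cannot be read
    return topology


if __name__ == "__main__":
    for node, cpus in sorted(numa_topology().items()):
        print(f"node {node}: CPUs {cpus}")
```

A thread that runs on one of a node's CPUs and works on data allocated from that node's memory avoids the slower remote accesses described above.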
Shared Memory Multi-Process Programming - OpenMP
OpenMP is an implementation of multithreading in which a master thread forks a specified number of worker threads and the work is divided among them. The threads then run in parallel, with the runtime environment allocating them to different processors. The main elements of OpenMP are constructs for thread creation, workload distribution, data-environment management, and thread synchronization, along with user-level runtime routines and environment variables. A combined C/C++/Fortran specification is available.
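The fork-join pattern can be illustrated with a short sketch. This is not OpenMP itself, which targets C, C++, and Fortran (where a loop like this would typically be annotated with a directive such as `#pragma omp parallel for reduction(+:sum)`). It is a Python analogue that uses the standard library's thread pool to show the same structure: split the iterations into contiguous blocks, hand one block to each worker, then combine the partial results. The names static_chunks and parallel_sum_of_squares are hypothetical. Note that in standard CPython the global interpreter lock prevents pure Python code from running on several cores at once, so this shows the shape of the model and not its speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def static_chunks(n_items, n_workers):
    """Split range(n_items) into n_workers contiguous blocks of near-equal size."""
    base, extra = divmod(n_items, n_workers)
    chunks, start = [], 0
    for worker in range(n_workers):
        # The first `extra` workers take one additional iteration each.
        size = base + (1 if worker < extra else 0)
        chunks.append(range(start, start + size))
        start += size
    return chunks


def parallel_sum_of_squares(values, n_workers=4):
    """Fork workers over the chunks, wait for them, and reduce the partial sums."""
    def partial_sum(chunk):
        return sum(values[i] * values[i] for i in chunk)

    chunks = static_chunks(len(values), n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:  # fork
        partials = list(pool.map(partial_sum, chunks))       # join
    return sum(partials)                                     # reduction


if __name__ == "__main__":
    print(parallel_sum_of_squares(list(range(10))))
```

Each worker only reads its own block and produces its own partial result, so no synchronization is needed until the final reduction.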
Applications of Multi-Processor Systems
The number of applications that can benefit from multi-processor systems continues to grow as organizations move towards virtualization to bring down costs and increase cloud computing efficiency. Multi-processor systems offer better VM consolidation and can host larger virtual machines (VMs) than 2-Way options. Their larger memory capacities enable more VMs per watt and more consistent scaling. This makes them an ideal solution for data centers, which are often limited by power and cooling costs. Software products such as Oracle Database, SAP HANA, SAP Applications, and Microsoft SQL Server also benefit from scale-up, multi-processor servers, which simplify server management while reducing server licensing and IT staff time/management costs.
Applications that were not designed and programmed for multi-processor systems may not benefit from the additional system memory and processors. In some cases these applications may run into problems, and thread or processor affinity will need to be used to bind them to specific processors. It is important to ensure that the applications in use will take full advantage of the capabilities of these systems. Please contact your software provider to confirm whether there are benefits to using multi-processor systems.
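As an illustration of processor affinity, the sketch below restricts the calling process to a chosen set of CPUs. It assumes a Linux system, where Python's standard library provides os.sched_getaffinity and os.sched_setaffinity. On platforms without these calls, the helper returns None without changing anything. The helper name pin_to_cpus is hypothetical.

```python
import os


def pin_to_cpus(requested):
    """Restrict the calling process to the requested CPUs that are available.

    Returns the resulting affinity set, or None on platforms that do not
    provide the Linux-only os.sched_setaffinity call.
    """
    if not hasattr(os, "sched_setaffinity"):
        return None
    requested = set(requested)
    available = os.sched_getaffinity(0)  # 0 means "the calling process"
    target = requested & available
    if not target:
        raise ValueError(
            f"none of CPUs {sorted(requested)} are available: {sorted(available)}"
        )
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    print(pin_to_cpus([0, 1]))
```

Binding an application to the CPUs of a single NUMA node in this way keeps its threads close to the memory they use. Whether that helps a particular application is something to measure.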