24/7 Solutions - Multiprocessing server architecture


SMP server architecture

SMP stands for symmetric multiprocessing. The main feature of SMP systems is a common physical memory shared by all processors.

Schematic of SMP-architecture.

Memory is the means of passing messages between processors. All processors have equal access rights to it and use the same addressing for all memory cells, which is why the architecture is called symmetric. An SMP system is built around a high-speed system bus (SGI PowerPath, Sun Gigaplane, DEC TurboLaser). Three types of functional blocks are plugged into the bus slots: processors (CPU), main memory, and the input/output subsystem. Slower buses (PCI, VME64) are used to connect the I/O modules. The best-known SMP systems are servers and workstations based on Intel processors (IBM, HP, Compaq, Dell, ALR, Unisys, DG, Fujitsu). The whole system runs under a single OS (for example, a UNIX SMP server or a FreeBSD SMP server).

Advantages of SMP-architecture:

  • Simplicity and universality of programming. The SMP architecture imposes no restrictions on the programming model used to build an application. The model of parallel branches is usually used, in which all processors work completely independently of each other; however, models that rely on interprocessor exchange can also be implemented. Shared memory speeds up such exchange, and the user has access to the whole memory at once. Fairly effective automatic parallelization tools exist for SMP systems.
  • Ease of operation. As a rule, SMP systems use air-based cooling, which simplifies their maintenance.
  • Relatively low price.
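
The parallel-branches model over shared memory can be illustrated with a minimal Python sketch. The function name `parallel_sum` and the worker count are assumptions made for this example, not part of any SMP system's API. Note also that CPython threads do not execute Python code truly in parallel; the sketch shows only the programming model, in which every branch reads the same memory directly.

```python
import threading

def parallel_sum(data, n_workers=4):
    """Sum `data` with several threads that all read the same list."""
    partial = [0] * n_workers                      # shared result slots, one per worker
    chunk = (len(data) + n_workers - 1) // n_workers

    def branch(worker_id):
        # Each branch reads its slice of the shared list directly;
        # nothing is copied or sent between workers.
        start = worker_id * chunk
        partial[worker_id] = sum(data[start:start + chunk])

    threads = [threading.Thread(target=branch, args=(i,)) for i in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(partial)

print(parallel_sum(list(range(1000))))
```

Each branch writes to its own slot of `partial`, so no locking is needed; branches that update the same location would need a lock or another synchronization primitive.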

Disadvantages:

Shared-memory systems built around a system bus scale poorly. This important drawback keeps SMP systems from being considered truly promising. In addition, the system bus has limited (although high) throughput and a limited number of slots. All of this clearly hinders performance growth as the number of processors and connected users increases. Real systems can use no more than 32 processors. To build scalable systems on the basis of SMP, cluster or NUMA architectures are used. SMP systems are programmed using the so-called shared memory paradigm.
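
Why a shared bus limits scaling can be seen in a deliberately simplified model. This is a toy estimate, not a measurement: the function `smp_speedup` and all bandwidth figures are hypothetical, and it assumes every processor needs the same fixed share of bus bandwidth to run at full speed.

```python
def smp_speedup(n_cpus, bus_bandwidth, demand_per_cpu):
    """Toy estimate of speedup when all CPUs share one bus.

    Assumes each CPU needs `demand_per_cpu` units of bus bandwidth to run
    at full speed; once total demand exceeds `bus_bandwidth`, the bus is
    the bottleneck and extra CPUs add nothing.
    """
    max_fed = bus_bandwidth / demand_per_cpu   # number of CPUs the bus can keep busy
    return min(n_cpus, max_fed)

# Hypothetical figures: a bus of 1600 units, 100 units needed per CPU.
for n in (1, 4, 16, 32, 64):
    print(n, smp_speedup(n, bus_bandwidth=1600, demand_per_cpu=100))
```

Under these assumed numbers the speedup stops growing once the bus is saturated, however many processors are added.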

MPP server architecture

MPP stands for massively parallel processing. The main feature of this architecture is that memory is physically distributed. The system is built from separate modules, each containing a processor, a local bank of main memory (RAM), two communication processors (routers) or a network adapter, and sometimes hard disks and/or other input/output devices. One router is used to transfer commands, the other to transfer data. In essence, such modules are fully functional computers. Only the CPU of a given module has access to that module's RAM bank. Modules are connected by special communication channels. The user can determine the logical number of the processor he or she is connected to and organize message exchange with other processors. The operating system can run on MPP machines in two ways. In the first, a full OS runs only on the control machine (the front end), while each module runs a heavily stripped-down OS that supports only the branch of the parallel application located on it. In the second, a full UNIX-like OS (Linux, FreeBSD) is installed separately on each module.

Schematic of architecture with separate memory

Main advantage:

The main advantage of distributed-memory systems is good scalability: unlike SMP systems, each processor in such machines has access only to its local memory, so there is no need for cycle-by-cycle synchronization of processors. Practically all performance records to date have been set by machines of this architecture consisting of several thousand processors (ASCI Red, ASCI Blue Pacific).

Disadvantages:

  • The absence of shared memory noticeably reduces the speed of interprocessor exchange, because there is no common medium for storing data intended for exchange between processors. Special programming techniques are required to implement message exchange between processors.
  • Each processor can use only the limited capacity of its local memory bank.
  • Because of these architectural drawbacks, significant effort is required to use system resources to the fullest.
  • This is what determines the high price of software for massively parallel distributed-memory systems.

Distributed-memory systems include the supercomputers MVS-1000, IBM RS/6000 SP, SGI/CRAY T3E, and the ASCI systems.

The latest CRAY T3E series from SGI, built on DEC Alpha 21164 processors with a peak performance of 1200 Mflops (CRAY T3E-1200), can scale up to 2048 processors.

MPP systems are programmed using the so-called message passing paradigm (MPI, PVM, BSPlib).
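
The message passing style can be sketched in Python without MPI. This example only emulates it: threads stand in for modules and `queue.Queue` objects stand in for communication channels, and the names `node` and `run` are hypothetical. Real MPP code would use a library such as MPI or PVM instead. The point is that each "module" touches only its own data and learns about the others solely through messages.

```python
import queue
import threading

def node(rank, inbox, outboxes, local_data, results):
    """One 'module': owns local_data and talks to others only via messages."""
    local_sum = sum(local_data)                 # work on private memory only
    if rank == 0:
        total = local_sum
        for _ in range(len(outboxes) - 1):      # receive one message per peer
            sender, value = inbox.get()
            total += value
        results.put(total)
    else:
        outboxes[0].put((rank, local_sum))      # send partial result to rank 0

def run(chunks):
    """Start one node per chunk and return the total gathered by rank 0."""
    inboxes = [queue.Queue() for _ in chunks]
    results = queue.Queue()
    threads = [
        threading.Thread(target=node, args=(r, inboxes[r], inboxes, chunk, results))
        for r, chunk in enumerate(chunks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results.get()

print(run([[1, 2, 3], [4, 5], [6]]))
```

The rank numbers play the role of the logical processor numbers mentioned above: a node decides what to send and to whom based on its own rank.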

NUMA - hybrid server architecture

NUMA stands for non-uniform memory access. The main feature of this architecture is non-uniform access to memory.

The hybrid architecture combines the convenience of shared-memory systems with the relatively low cost of distributed-memory systems. Its essence lies in a special organization of memory: memory is physically distributed across different parts of the system but logically shared, so the user sees a single address space. The system consists of homogeneous base modules (boards), each containing a small number of processors and a block of memory. Modules are connected by a high-speed switch. A single address space is maintained, and access to remote memory, i.e. the memory of other modules, is supported in hardware. Access to local memory is several times faster than access to remote memory. In essence, NUMA is an MPP (massively parallel) architecture in which SMP nodes serve as the individual computing elements.
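
The cost of non-uniform access can be estimated with simple arithmetic. The sketch below assumes just two latency classes (local and remote); the function name `average_access_time` and the latency figures are hypothetical and chosen only to make the effect visible.

```python
def average_access_time(local_fraction, t_local, t_remote):
    """Mean memory access time for a given share of local accesses."""
    if not 0.0 <= local_fraction <= 1.0:
        raise ValueError("local_fraction must be between 0 and 1")
    return local_fraction * t_local + (1.0 - local_fraction) * t_remote

# Hypothetical latencies: remote access is 4 times slower than local.
for share in (1.0, 0.9, 0.5):
    print(share, average_access_time(share, t_local=100, t_remote=400))
```

The estimate shows why data placement matters on NUMA machines: the more accesses stay within a module's own memory, the closer the average gets to the local latency.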

Block diagram of a computer with a hybrid network; nodes are connected by a Butterfly-type network.

The idea of hybrid architecture was first proposed by Steve Wallach, who implemented it in the Exemplar series of systems. Wallach's design was a system consisting of 8 SMP nodes. HP bought the idea and implemented it in its SPP series of supercomputers. Seymour R. Cray picked up the idea and added a new element, the coherent cache, creating the so-called cc-NUMA architecture (Cache Coherent Non-Uniform Memory Access), which stands for "non-uniform memory access with cache coherence". He implemented it in systems of the Origin type.

The organization of multilevel hierarchical memory.

The concept of cache coherence means that all CPUs receive identical values of the same variables at any moment. Indeed, since a cache belongs to an individual processor rather than to the multiprocessor system as a whole, data placed in the cache of one processor may be inaccessible to another. To avoid this, the information stored in the processors' caches has to be synchronized.

There are several ways to maintain cache coherence:

  • Use a bus snooping mechanism (snoopy bus protocol), in which caches monitor the variables transferred to any of the CPUs and, if necessary, update their own copies of those variables.
  • Set aside a special part of memory (a directory) responsible for tracking the validity of all copies of variables in use.
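
The snooping approach can be illustrated with a small simulation. This is a simplified sketch of a write-through, write-update scheme, not the protocol of any particular machine; the classes `Bus` and `Cache` and their methods are invented for the example.

```python
class Bus:
    """Shared bus that broadcasts every write to all attached caches."""
    def __init__(self):
        self.memory = {}
        self.caches = []

    def write(self, source, name, value):
        self.memory[name] = value               # write through to main memory
        for cache in self.caches:
            if cache is not source:
                cache.snoop(name, value)

class Cache:
    """Private cache of one CPU that watches ('snoops') bus traffic."""
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        bus.caches.append(self)

    def read(self, name):
        if name not in self.lines:              # miss: fetch from memory
            self.lines[name] = self.bus.memory[name]
        return self.lines[name]

    def write(self, name, value):
        self.lines[name] = value
        self.bus.write(self, name, value)       # announce the write on the bus

    def snoop(self, name, value):
        if name in self.lines:                  # update only copies we hold
            self.lines[name] = value

bus = Bus()
cpu0, cpu1 = Cache(bus), Cache(bus)
cpu0.write("x", 1)
print(cpu1.read("x"))   # cpu1 misses and loads x from memory
cpu0.write("x", 2)
print(cpu1.read("x"))   # cpu1's cached copy was updated by snooping
```

Without the `snoop` step, the second read on `cpu1` would return the stale value from its own cache, which is exactly the inconsistency that coherence mechanisms exist to prevent.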

The best-known cc-NUMA systems are the HP 9000 V-class in SCA configurations, SGI Origin3000, Sun HPC 15000, and IBM/Sequent NUMA-Q 2000. At present, the maximum number of processors in cc-NUMA systems can exceed 1000 (the Origin3000 series). Usually the whole system runs under a single operating system, as in SMP (UNIX). Dynamic partitioning is also possible, in which separate partitions of the system run under different OSes. NUMA systems, like SMP systems, are programmed using the shared memory paradigm.

PVP server architecture

PVP stands for parallel vector processing. The defining feature of PVP systems is the presence of special vector-pipeline processors, which provide instructions for uniform processing of vectors of independent data that execute efficiently on pipelined functional units. As a rule, several such processors (1-16) work simultaneously on shared memory (as in SMP) within multiprocessor configurations. Several such nodes can be joined by a switch (as in MPP). Because data transfer in vector format is much faster than in scalar format (the maximum speed can reach 64 GB/s, which is two orders of magnitude faster than in scalar machines), the problem of interaction between data streams during parallelization becomes insignificant. What parallelizes poorly on scalar machines parallelizes well on vector machines. Thus PVP systems can serve as general-purpose machines. However, because vector processors are quite expensive, these machines will not become widely available.

The most popular PVP machines are:

1. CRAY SV-2, SMP architecture. Peak performance of the system in a standard configuration can reach tens of teraflops.

2. NEC SX-6, NUMA architecture. Peak system performance can reach 8 Tflops; the performance of one processor is 8 Gflops. The system scales up to 128 nodes.

3. Fujitsu VPP5000 (vector parallel processing), MPP architecture. The performance of one processor is 9.6 Gflops, peak system performance can reach 1249 Gflops, and the maximum memory capacity is 8 TB. The system scales up to 512 nodes.

The programming paradigm on PVP systems involves vectorizing loops (to achieve reasonable performance on a single processor) and parallelizing them (to load several processors simultaneously with one application).
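
The two steps can be sketched in plain Python on the loop `result[i] = a * x[i] + y[i]`. This is a conceptual model only: the vector length of 64, the function names, and the round-robin distribution of strips are assumptions for illustration, and the code runs sequentially rather than on real vector hardware.

```python
VECTOR_LENGTH = 64   # hypothetical vector register length

def vector_axpy(a, x_strip, y_strip):
    """Stand-in for one vector instruction: a*x + y on a whole strip."""
    return [a * x + y for x, y in zip(x_strip, y_strip)]

def strips(n, length=VECTOR_LENGTH):
    """Split the index range 0..n into strips of at most `length` elements."""
    return [(start, min(start + length, n)) for start in range(0, n, length)]

def axpy(a, x, y, n_cpus=4):
    result = [0] * len(x)
    all_strips = strips(len(x))
    for cpu in range(n_cpus):
        # Parallelization: strips are dealt out to CPUs round-robin.
        # The CPUs are visited one after another here; on a real
        # machine each would process its strips concurrently.
        for start, stop in all_strips[cpu::n_cpus]:
            # Vectorization: one strip is handled as a single vector operation.
            result[start:stop] = vector_axpy(a, x[start:stop], y[start:stop])
    return result

print(axpy(2, [1, 2, 3], [10, 20, 30]))
```

The loop iterations are independent of each other, which is what makes both steps possible; a loop in which each iteration depends on the previous one could not be split this way.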

Thanks to the large physical memory (a fraction of a terabyte), even poorly vectorizable problems are solved faster on PVP systems than on systems with scalar processors.