Web Hosting Reviews
HA technologies :: The Cluster network organization

24/7 Solutions - The Cluster network organization.


The network is a modular, adaptable switching system that can be adjusted to a wide range of requirements. Modularity makes it easy to add new components or move existing ones, and adaptability simplifies modification. The network of a Beowulf cluster is no different from a network of workstations, so in the simplest case ordinary network cards and switches are all that is needed to build a cluster. A cluster does have one special feature, however. Its network is intended primarily not for communication between servers but for communication between computing processes. Therefore, the higher the throughput of your network, the faster the parallel tasks running on the cluster will be completed. It follows that network throughput is of paramount importance.

A variety of network equipment is used to build clusters. The characteristics of standard network devices are noticeably inferior to those of the specialized interconnects in MPP servers. As a result, the throughput of the network connecting the cluster nodes is in many cases the deciding factor for cluster performance. Network equipment is usually characterized by two parameters:

Throughput.

This is the speed of data transmission between two nodes once a connection has been established. Manufacturers usually quote peak throughput, which is 1.5 to 2 times higher than the real figure.

Latency.

This is the average time between the call to a data transmission function and the transfer itself. The time is spent on addressing the information, on the operation of intermediate network devices, and on other overhead that arises during data transmission.

For comparison, here are the parameters of some of the most popular network devices.

Network equipment      Peak throughput                    Latency

1. Fast Ethernet       12.5 Mbyte/sec                     115 usec
2. Gigabit Ethernet    125 Mbyte/sec                      115 usec
3. Myrinet             160 Mbyte/sec                      15 usec
4. SCI                 400 Mbyte/sec (100 in practice)    2.3 usec
5. cLAN                150 Mbyte/sec                      30 usec

In practice, throughput and latency not only characterize a cluster but also limit the class of problems that can be solved on it effectively. If a problem requires frequent data transfers, a cluster built on high-latency network equipment (for example, Gigabit Ethernet) will spend most of its time not on transferring data between processes but on establishing communication. During that time the nodes stand idle, and we do not get a substantial gain in performance. If large volumes of data are sent, however, the influence of latency on cluster efficiency diminishes, because the transfer itself takes a comparatively long time.
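The trade-off between latency and message size can be made concrete with a rough model. The sketch below assumes that the time to send one message is simply the latency plus the message size divided by the peak throughput. This linear model and the function names are illustrative assumptions, not measurements. The figures are the peak values from the table above, so real results would be lower.

```python
# Peak throughput (Mbyte/sec) and latency (usec) taken from the table above.
NETWORKS = {
    "Fast Ethernet": (12.5, 115.0),
    "Gigabit Ethernet": (125.0, 115.0),
    "Myrinet": (160.0, 15.0),
    "SCI": (400.0, 2.3),
    "cLAN": (150.0, 30.0),
}


def transfer_time_usec(size_bytes, throughput_mbyte_s, latency_usec):
    """Assumed linear model: fixed latency plus size divided by throughput."""
    # 1 Mbyte/sec equals 1 byte/usec, so the division already yields usec.
    return latency_usec + size_bytes / throughput_mbyte_s


def effective_throughput(size_bytes, throughput_mbyte_s, latency_usec):
    """Bytes delivered per microsecond of total time, i.e. Mbyte/sec."""
    total = transfer_time_usec(size_bytes, throughput_mbyte_s, latency_usec)
    return size_bytes / total


if __name__ == "__main__":
    for size in (100, 10_000, 1_000_000):
        print(f"message size: {size} bytes")
        for name, (bandwidth, latency) in NETWORKS.items():
            eff = effective_throughput(size, bandwidth, latency)
            print(f"  {name:17s} {eff:8.2f} Mbyte/sec effective")
```

Under this model a 100-byte message over Gigabit Ethernet is dominated almost entirely by latency, while a 1 Mbyte message comes close to the peak rate, which matches the reasoning in the paragraph above.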

For low-cost clusters, the fast technologies Myrinet, SCI, and cLAN will most likely be out of reach financially, so we will consider cheaper solutions. Using 10 Mbit networks for a cluster is possible but unwise: you risk getting more disappointment from the cluster than real gains in the efficiency and reliability of your system. We will consider equipment for networks of 100 Mbit and above.

Network cards. Any commercially available card that supports the 100BaseTX or Gigabit Ethernet standards can be used as a network adapter. In order of preference, we recommend 3Com first of all; other options include Compex, Intel, and Macronix. Cards based on the Intel 21142/21143 chipsets are especially popular for building clusters. Their popularity rests on their reputation for high performance, while their price is usually quite low compared with competing offerings. 3Com network cards, for their part, have several advantages that noticeably affect the performance of network communications. Here are some examples of the hardware features of 3Com cards:

  • TCP/UDP/IP checksum offload. Frees the central processor from intensive checksum calculations by performing them on the network card itself. This raises system performance and saves processor time.
  • TCP segment reassembly offload. Reduces the load on the central processor, raising system performance.
  • Interrupt coalescing. Allows several received packets to be grouped together. This improves the computing efficiency of the server by reducing the number of interrupts and freeing as many processor resources as possible for applications.
  • Bus-mastering DMA mode. Provides more efficient data exchange, reducing the load on the central processor.
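To give a sense of the work that checksum offloading takes away from the processor, here is a minimal sketch of the 16-bit ones' complement Internet checksum used by IP, TCP, and UDP (described in RFC 1071). It is an illustration in plain Python, not code from any 3Com driver. Without offloading, the host has to run a loop like this over every packet.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add one big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


if __name__ == "__main__":
    payload = bytes.fromhex("0001f203f4f5f6f7")
    checksum = internet_checksum(payload)
    print(f"checksum: {checksum:#06x}")
    # A receiver that sums the payload together with its checksum gets zero.
    print(internet_checksum(payload + checksum.to_bytes(2, "big")))
```

The per-byte loop is cheap for one packet but adds up at high packet rates, which is why moving it onto the card frees processor time for the parallel application.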

In any case, unless you plan to use channel bonding, a technology that combines several network adapters into one high-speed virtual channel, you can buy any card. Practically all modern network cards currently on sale are recognized by Linux without problems and work normally.

For channel bonding, it is better to choose Intel EtherExpress PRO/100 or 3Com Fast Ethernet cards (for example, the 3c905B or 3c905C), or any Gigabit Ethernet card from 3Com or Intel. Specialized server network cards with several Ethernet ports are an interesting option, for example the Intel EtherExpress PRO/1000 MF Dual Port or the 3Com Fast EtherLink Server Dual Port 3c982C. Such cards occupy half as many PCI slots in a computer and therefore let you install twice as many network ports to combine into a bonded channel.

Switches. The second important element of a cluster network is the switching equipment. When choosing switching devices, you must also take into account whether channel bonding will be used. Depending on whether channel bonding is used in the cluster, different network equipment may be appropriate.

Switches and other elements of the network infrastructure provide communication between processors and support various administrative functions. For interprocessor communication (Inter Process Communication, IPC), the Myrinet-2000 switch from Myricom, a very fast, highly scalable, high-bandwidth device, is widely used.

As the number of cluster nodes increases, the total bandwidth grows proportionally while latency remains constant. In other words, the bandwidth of each path is the same and the number of paths depends on the number of nodes, so each node has a connection to every other node regardless of cluster size. The bandwidth can reach 200 Mb per second in each direction, with a latency of 6-8 usec. Communication between user spaces can be implemented over the IP or GM protocols using Myricom's user-level software.

If the parallel computing environment does not require intensive communication between processors, less expensive means such as Ethernet can be used. In custom-built projects, GigaNet, SCI, or ServerNet technologies can also be applied, and in the future InfiniBand as well.

The switch is chosen first of all on the basis of its characteristics. In the simplest case, a cluster network can be built with simple hubs. This solution is the most attractive in price but the weakest technically. A hub does not switch the data packets it carries: every packet sent on the network goes to absolutely every participant. Each PC "hears" all the packets on the network, regardless of whether a particular packet is intended for it. Under active interprocessor exchange this can lead to network overload, a growing number of collisions, and a drop in the effective speed of the parallel server. To solve this problem you need active network equipment: switches, which establish what amount to dedicated communication channels between pairs of servers.

In a 100 Mbit network, for example, the task of the switch is to sustain 100 Mbit/sec simultaneously on all n/2 connections between pairs of ports of an n-port switch. In theory the switch should guarantee this, but in practice equipment manufacturers quite often simplify the electronics of their products, both to cut the price and to maximize the number of ports. In the latter case, parallel workloads can run into conflicts at the Fast Ethernet level, which reduces the message exchange rate and, accordingly, the efficiency of parallelization.
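The difference between a hub and an ideal switch can be estimated with simple arithmetic. The sketch below is a back-of-the-envelope comparison under two stated assumptions: a hub behaves as a single shared medium whose total traffic cannot exceed one link's rate (collisions, which would lower it further, are ignored), and an ideal switch carries all n/2 port pairs at the full link rate. The function names are illustrative.

```python
def hub_aggregate_mbit(link_mbit, ports):
    """Shared medium: all ports together get at most one link's rate."""
    return link_mbit if ports >= 2 else 0


def switch_aggregate_mbit(link_mbit, ports):
    """Ideal switch: n/2 disjoint port pairs, each at the full link rate."""
    return link_mbit * (ports // 2)


if __name__ == "__main__":
    for ports in (8, 16, 24):
        hub = hub_aggregate_mbit(100, ports)
        switch = switch_aggregate_mbit(100, ports)
        print(f"{ports:2d} ports: hub {hub} Mbit/sec, ideal switch {switch} Mbit/sec")
```

The second figure is what the internal electronics of the switch must actually sustain. A switch whose internals fall short of it is the simplified design the paragraph above warns about.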

In our experience, the order of preference when choosing a switch for a cluster network is: Cisco Catalyst, 3Com SuperStack 3, Compex Switch. In last place are cheap hubs from various manufacturers, such as Compex or 3Com.

Of course, when deciding on a switch you also need to consider its other characteristics, including price. Good equipment costs more. Excellent Cisco Catalyst switches (for example, the well-known model 5000, which has a large number of ports and supports link aggregation) are priced higher than equipment from less famous firms.

Not all switches support channel bonding. If you plan to use channel bonding to increase the throughput of your network, you need to choose the switch with particular care. Hubs cannot be used in this case. The difficulty with channel bonding is that you end up with two or more network cards sharing the same MAC address. A switch in its usual operating mode will either shut down or constantly rebuild its internal port tables, reassigning your MAC address from one port to another. This can bring the channel down completely.
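The port-table problem can be illustrated with a toy model of how a switch learns MAC addresses. The class below is a simplified sketch, not the behavior of any particular vendor's firmware. It records which port each source MAC address was last seen on and counts how often a known address jumps to a different port. The MAC addresses and port numbers are made up for the example.

```python
class LearningSwitch:
    """Toy model of a switch's MAC address table."""

    def __init__(self):
        self.mac_table = {}  # MAC address -> port it was last seen on
        self.moves = 0       # times a known MAC appeared on a different port

    def receive(self, port, src_mac):
        known_port = self.mac_table.get(src_mac)
        if known_port is not None and known_port != port:
            self.moves += 1  # the table entry has to be rewritten
        self.mac_table[src_mac] = port


def count_moves(frames):
    """Feed (port, source MAC) pairs to a fresh switch and count table moves."""
    switch = LearningSwitch()
    for port, mac in frames:
        switch.receive(port, mac)
    return switch.moves


if __name__ == "__main__":
    # Two bonded cards share one MAC and send alternately on ports 1 and 2.
    bonded = [(1, "00:11:22:33:44:55"), (2, "00:11:22:33:44:55")] * 4
    # Two ordinary cards with distinct MACs on the same two ports.
    normal = [(1, "00:11:22:33:44:55"), (2, "00:11:22:33:44:66")] * 4
    print("bonded, no aggregation:", count_moves(bonded))
    print("distinct MACs:", count_moves(normal))
```

In the bonded case nearly every frame forces the table entry to move, which is the constant rebuilding described above. A switch configured for link aggregation treats the bonded ports as one logical port and avoids this.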

For channel bonding to work properly, the switch must provide a Link Aggregation function or support the IEEE 802.3ad standard. When buying a switch, read the accompanying specifications closely and look for these magic words. Not every switch with Link Aggregation allows it to be applied to all ports. Some models have 12 or 24 100 Mbit ports and two Gigabit ports, and Link Aggregation can be configured only on the gigabit ports, which are meant for connecting two switches. Clearly, such models are unsuitable for our purposes. Expert advice when buying a switch is therefore essential.

Switches that allow Link Aggregation to be configured include the Cisco Catalyst 2900 series, Cisco Catalyst 3500 series, Cisco Catalyst 5000 series, 3Com SuperStack 3 4950 and 4400, and others. Note that the presence or absence of Link Aggregation depends not only on the switch model but also on the version of its software.