Web Hosting Reviews

24/7 Solutions - Why cluster technologies?

Some problems demand more performance than ordinary computers or servers can deliver. In these cases several powerful systems are combined into a cluster, which lets the computation be spread not only across different processors (if multiprocessor SMP systems are used) but also across different computers. For problems that parallelize well and place no heavy demands on communication between the parallel threads, an HPC cluster is often built from a large number of low-power uniprocessor systems. Such solutions can often deliver, at low cost, much higher performance than a supercomputer.

Building such a cluster requires specific knowledge, and using it entails a radical change of programming paradigm, which is psychologically quite difficult. You may be a professional at writing sequential code, but that will not spare you from having to learn the methods of parallel programming.

A common misconception is that simply using a supercomputer will yield a performance gain. This is not true. If your problem has no internal parallelism and has not been adapted appropriately, the most you can get from a cluster is to run several copies of the program at once, each working on different input data. This will not speed up any single run of the program, but it will save a lot of time if you need to compute many variants in a limited time.
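The "several copies with different input data" approach described above can be sketched on a single multiprocessor machine as follows. This is only an illustration under assumptions: `simulate` is a hypothetical stand-in for your unmodified sequential program, `run_variants` is a name invented here, and a real cluster would use a batch scheduler rather than a local process pool.

```python
from concurrent.futures import ProcessPoolExecutor


def simulate(initial_value):
    """Hypothetical stand-in for one sequential run of the program."""
    x = initial_value
    for _ in range(1000):
        x = 0.5 * (x + 2.0 / x)  # Newton iteration converging to sqrt(2)
    return x


def run_variants(initial_values, workers=4):
    """Run one unmodified sequential job per input, several at a time."""
    # No single run gets faster; the whole set of variants finishes sooner.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(simulate, initial_values))
```

Calling `run_variants([1.0, 2.0, 3.0, 10.0])` from under an `if __name__ == "__main__":` guard returns one result per input, in input order. Note that `simulate` itself is untouched, which is exactly why this approach needs no parallel programming.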

If your problem is so large that a single run on a uniprocessor PC can take days, weeks, or months, you will have to put effort into adapting the algorithm. The problem must be divided into several smaller subtasks (as many as there are processors) that can run independently, and wherever independent execution is impossible, a synchronization procedure must be called explicitly to exchange data over the network. For example, if you are processing a large data array, it is reasonable to divide it into regions and distribute them among the processors, ensuring a uniform load across the whole cluster.
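As a minimal sketch of the array example above, the following splits an index range into contiguous regions whose sizes differ by at most one element. The names `split_evenly` and `chunked_sum` are invented for illustration, and the partial sums are computed one after another here for simplicity; on a cluster each region would be handled by its own processor, and the final combination would be the point where data crosses the network.

```python
def split_evenly(n_items, n_workers):
    """Return half-open (start, stop) index ranges, one per worker.

    Region sizes differ by at most one item, so the load is as uniform
    as a plain block decomposition allows.
    """
    if n_workers <= 0:
        raise ValueError("n_workers must be positive")
    base, extra = divmod(n_items, n_workers)
    ranges = []
    start = 0
    for rank in range(n_workers):
        # The first `extra` workers take one additional item each.
        size = base + (1 if rank < extra else 0)
        ranges.append((start, start + size))
        start += size
    return ranges


def chunked_sum(data, n_workers):
    """Sum `data` region by region, then combine the partial results."""
    # Each partial sum depends only on its own region (independent work).
    partials = [sum(data[a:b]) for a, b in split_evenly(len(data), n_workers)]
    # Combining the partials is the single synchronization step.
    return sum(partials)
```

For ten items and three workers, `split_evenly(10, 3)` gives the regions `(0, 4)`, `(4, 7)` and `(7, 10)`. An even split balances the load only if every element costs about the same to process; that is an assumption of this sketch.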

Therefore, before moving on to the practical implementation of cluster technologies, you need to settle several questions of principle for yourself.

The first is: "Do I need a cluster and parallel computing to solve my problems?" To answer it, look closely at the problems you are solving. Parallel computing is a fairly specialized area of mathematics, and it cannot always be applied. You most likely do not need a cluster if:

  • You use specialized software packages that are not adapted for parallel computation in the MPI and PVM environments, or that are not intended to run under UNIX. In this case you simply cannot use more than one processor for a task, or cannot run your program in a foreign operating environment at all.
  • The programs you have written to solve your problems need no more than a few hours of processor time on the equipment you already have. The time you spend parallelizing and debugging may well eat up the entire speed advantage that multiprocessing would give.
  • The lifetime of your program is about as long as the time it would take to develop its parallel version.

The second question you must settle is whether your problem can be parallelized in principle. Some numerical schemes, by the nature of their algorithms, do not lend themselves to efficient parallelization. Before committing to a cluster for your problem, make sure that parallel algorithms can actually be applied to it.

The application must be divided into parts that can execute in parallel on several processors, and divided efficiently, so that the separately executing parts of the program do not affect the execution of the others.

Hence the first conclusion: before thoroughly reworking your code for the move to a parallel computer, think carefully about whether the development is worth it. If, having assessed the algorithm built into the program, you find that the share of sequential operations is large, you clearly should not count on a significant speedup, and you should think about replacing individual components of the algorithm.
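One common way to put a number on this conclusion is Amdahl's law, which the text above does not name but which models the same idea: if a fraction s of the work is strictly sequential, the speedup on n processors is at most 1 / (s + (1 - s) / n). The sketch below assumes you have already estimated s for your own program, and it ignores communication and synchronization costs, so real speedups will be lower.

```python
def amdahl_speedup(serial_fraction, n_processors):
    """Upper bound on speedup for a given share of sequential work."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    # The sequential part takes the same time on any number of processors;
    # only the remaining part is divided among them.
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


def speedup_limit(serial_fraction):
    """Speedup bound as the number of processors grows without limit."""
    if serial_fraction == 0.0:
        return float("inf")
    return 1.0 / serial_fraction
```

For instance, with a sequential share of 20% the bound on 16 processors is about 4, and no number of processors can push it past 5. In that situation, replacing the sequential components of the algorithm helps more than adding nodes.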