Web Hosting Reviews


24/7 Solutions - Fault-Tolerant Operating Systems


Using fault-tolerant hardware without a corresponding fault-tolerant operating system is like flying a plane without wings. In other words, neither the hardware nor the operating system is sufficient on its own to make a computing system fault tolerant.

Historically, hardware and operating systems have always been developed in close interaction. Sometimes the architecture of an operating system influenced the design of the hardware, but usually the operating system was designed to fit hardware that already existed.

Modern operating systems are built in layers. The lowest layer contains hardware-dependent modules such as device drivers. The uppermost layers rely on an abstract model of the computer and pay no attention to hardware details. Fault-tolerant operating systems are also built hierarchically. One example of a fault-tolerant, freely distributed UNIX operating system is FreeBSD. The structure of the version of this operating system that we developed is shown below.

Figure: Structure of the fault-tolerant operating system.

Continuous operation of the system is achieved through a well-developed mechanism for automatic recovery after failure. If a process is configured as fault tolerant, the system always keeps two copies of it, placed on physically different processors. One copy is the active process and the other is the backup. If the primary process is interrupted by an intermittent hardware or software fault, the backup process takes over the role of the primary and continues the computation from the point of interruption.

Because the memory contents of the two processes are identical, and the second processor has the same access to all system resources as the primary one, no additional steps are needed to restore the interrupted process. The operating system immediately creates a new backup process on another CPU, so the process remains duplicated.
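The takeover described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the class name ProcessPair, the CPU labels, and the "counter" state are hypothetical, and a real system mirrors process memory at a much lower level than copying a Python dictionary.

```python
import copy


class ProcessPair:
    """Toy model of an active/backup process pair on different CPUs.

    Hypothetical illustration only; not the API of any real system.
    """

    def __init__(self, cpus, initial_state):
        if len(cpus) < 2:
            raise ValueError("a process pair needs at least two CPUs")
        self.free_cpus = list(cpus)
        self.active_cpu = self.free_cpus.pop(0)
        self.backup_cpu = self.free_cpus.pop(0)
        self.active_state = copy.deepcopy(initial_state)
        self.backup_state = copy.deepcopy(initial_state)

    def step(self):
        # The active copy does the work ...
        self.active_state["counter"] += 1
        # ... and the backup memory is kept identical to it.
        if self.backup_cpu is not None:
            self.backup_state = copy.deepcopy(self.active_state)

    def fail_active(self):
        """Simulate a failure of the CPU running the active copy."""
        if self.backup_cpu is None:
            raise RuntimeError("no backup copy left to take over")
        # The backup becomes the active process and resumes from the
        # point of interruption, because its memory is identical.
        self.active_cpu = self.backup_cpu
        self.active_state = self.backup_state
        # A new backup is created on another CPU, if one is free.
        if self.free_cpus:
            self.backup_cpu = self.free_cpus.pop(0)
            self.backup_state = copy.deepcopy(self.active_state)
        else:
            self.backup_cpu = None
            self.backup_state = None


if __name__ == "__main__":
    pair = ProcessPair(["cpu0", "cpu1", "cpu2"], {"counter": 0})
    pair.step()
    pair.step()
    pair.fail_active()  # cpu0 fails; cpu1 takes over, cpu2 becomes backup
    pair.step()
    print(pair.active_cpu, pair.backup_cpu, pair.active_state["counter"])
    # prints: cpu1 cpu2 3
```

The computation continues at counter value 3 after the failure, which mirrors the point that no separate restoration step is needed when both copies hold the same memory contents.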

Functions of the layers closest to the hardware:

  • Checking the working state of each processor (liveness check, voting);
  • Switching off failed processors;
  • Redistributing the functions of failed processors and maintaining the integrity of cache and main memory;
  • Supporting fault-tolerant peripheral devices (for example, RAID disk systems).
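
The voting check and the switching-off step from the list above could look roughly like the following sketch. It assumes simple majority voting over the outputs of redundant processors; the article does not say which voting scheme is used, and the function names vote and switch_off_failed are hypothetical.

```python
from collections import Counter


def vote(outputs):
    """Majority vote over redundant processor outputs.

    `outputs` maps a CPU label to the value that CPU produced.
    Returns (majority_value, sorted list of CPUs that disagreed).
    Hypothetical helper for illustration only.
    """
    if not outputs:
        raise ValueError("no processor outputs to vote on")
    value, votes = Counter(outputs.values()).most_common(1)[0]
    # A strict majority is required to decide which processors are faulty.
    if votes <= len(outputs) // 2:
        raise RuntimeError("no majority; faulty processors cannot be identified")
    failed = sorted(cpu for cpu, out in outputs.items() if out != value)
    return value, failed


def switch_off_failed(active_cpus, failed):
    """Return the list of CPUs that stay in service."""
    return [cpu for cpu in active_cpus if cpu not in failed]


if __name__ == "__main__":
    result, failed = vote({"cpu0": 42, "cpu1": 42, "cpu2": 41})
    print(result, failed)                                         # prints: 42 ['cpu2']
    print(switch_off_failed(["cpu0", "cpu1", "cpu2"], failed))    # prints: ['cpu0', 'cpu1']
```

With only two processors that disagree there is no majority, so the sketch raises an error instead of guessing which one has failed.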

Higher layers handle tasks such as restarting processes, reconfiguring network interfaces at the protocol level (reassigning the TCP/IP address to the interface of a secondary network, which becomes active after the primary network fails), starting the newly configured network, and so on.
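
The address reassignment mentioned above can be modelled as follows. This is a minimal in-memory sketch, not real network configuration: the interface names, the example address, and the function fail_over_address are assumptions made for illustration, and an actual system would reconfigure the interfaces through the operating system.

```python
def fail_over_address(interfaces, primary, secondary):
    """Move the service address from a failed primary interface
    to the secondary one and return the name of the active interface.

    `interfaces` maps an interface name to {"up": bool, "address": str or None}.
    Hypothetical model for illustration only.
    """
    if interfaces[primary]["up"]:
        return primary  # primary network is healthy, nothing to do
    address = interfaces[primary]["address"]
    if address is not None:
        # Reassign the address to the secondary interface ...
        interfaces[primary]["address"] = None
        interfaces[secondary]["address"] = address
    # ... and make the secondary network the active one.
    interfaces[secondary]["up"] = True
    return secondary


if __name__ == "__main__":
    nets = {
        "net0": {"up": False, "address": "192.0.2.10"},  # failed primary
        "net1": {"up": False, "address": None},          # standby secondary
    }
    print(fail_over_address(nets, "net0", "net1"))  # prints: net1
    print(nets["net1"]["address"])                  # prints: 192.0.2.10
```

Calling the function again after the failover leaves the address on the secondary interface, so the operation is safe to repeat.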

To applications, the system presents the standard UNIX interface. The operating system's functionality can be extended through a well-defined application programming interface (API), which in a UNIX environment is usually implemented through system calls or library functions. An application can use all of these facilities, or it can run without them in a mode that is not protected against failures.