











                     Figure 11.2 Cost/dependability curve (vertical axis: Cost; horizontal axis: Dependability, ranging over Low, Medium, High, Very High, Ultra-High)


                                11.2 Availability and reliability


                                       System availability and reliability are closely related properties that can both be
                                       expressed as numerical probabilities. The availability of a system is the probability
                                       that the system will be up and running to deliver its services to users on request.
                                       The reliability of a system is the probability that the system’s services will be
                                       delivered as defined in the system specification. If, on average, 2 inputs in every 1,000
                                       cause failures, then the reliability, expressed as a rate of occurrence of failure, is
                                       0.002. If the availability is 0.999, this means that, over some time period, the system
                                       is available for 99.9% of that time.
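
The two figures in this paragraph can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the one-year period (365 days) is an assumption, since the text says only "some time period".

```python
# Illustrative only: these helper names are hypothetical, not from the text.

def failure_probability(failed_inputs: int, total_inputs: int) -> float:
    """Fraction of inputs that cause a failure."""
    if total_inputs <= 0:
        raise ValueError("total_inputs must be positive")
    return failed_inputs / total_inputs


def expected_downtime_hours(availability: float, period_hours: float) -> float:
    """Expected time unavailable over a period, for an availability in [0, 1]."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    return (1.0 - availability) * period_hours


# The 2-in-1,000 example from the text.
rate = failure_probability(2, 1000)

# Assumed period: one 365-day year, expressed in hours.
downtime = expected_downtime_hours(0.999, 24 * 365)

print(f"failure rate: {rate:.3f}")
print(f"expected downtime: {downtime:.2f} hours per year")
```

Under these assumptions, an availability of 0.999 still allows several hours of unavailability per year, which is why availability targets are usually stated together with the period over which they are measured.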
                                         Reliability and availability are closely related, but sometimes one is more
                                       important than the other. If users expect continuous service from a system, then the
                                       system has a high availability requirement: it must be available whenever a demand is
                                       made. However, if the losses that result from a system failure are low and the system
                                       can recover quickly, then failures don’t seriously affect system users. In such systems,
                                       the reliability requirements may be relatively low.
                                         A telephone exchange switch that routes phone calls is an example of a system where
                                       availability is more important than reliability. Users expect a dial tone when they pick
                                       up a phone, so the system has high availability requirements. If a system fault occurs
                                       while a connection is being set up, the fault is often quickly recoverable. Exchange
                                       switches can usually reset the system and retry the connection attempt. This can be done
                                       very quickly, and phone users may not even notice that a failure has occurred. Furthermore,
                                       even if a call is interrupted, the consequences are usually not serious. Therefore,
                                       availability rather than reliability is the key dependability requirement for this type of system.
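
The reset-and-retry behaviour described for the exchange switch can be sketched as follows. This is a minimal illustration, not how any real switch is implemented: `attempt_connection`, `reset`, and `max_attempts` are hypothetical names introduced here.

```python
# Hypothetical sketch: `attempt_connection` and `reset` stand in for whatever
# a real switch does; neither name comes from the text.
from typing import Callable


def connect_with_retry(
    attempt_connection: Callable[[], bool],
    reset: Callable[[], None],
    max_attempts: int = 3,
) -> bool:
    """Try to set up a connection, resetting and retrying after each failure."""
    for _ in range(max_attempts):
        if attempt_connection():
            return True   # connection set up; the user sees no failure
        reset()           # recover from the fault before trying again
    return False          # gave up: the failure becomes visible to the user


# Simulated fault: the first attempt fails, the second succeeds.
outcomes = iter([False, True])
resets = []
connected = connect_with_retry(
    lambda: next(outcomes),
    lambda: resets.append("reset"),
)
print(connected, len(resets))
```

In this simulation the fault is masked by a single reset, so the service stays available to the user even though a failure occurred internally.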
                                         System reliability and availability may be defined more precisely as follows:
                                       1.  Reliability: The probability of failure-free operation over a specified time, in
                                          a given environment, for a specific purpose.