
12.3   Reliability specification  323


                        Availability  Explanation

                         0.9        The system is available for 90% of the time. This means that, in a 24-hour period
                                    (1,440 minutes), the system will be unavailable for 144 minutes.

                         0.99       In a 24-hour period, the system is unavailable for 14.4 minutes.

                         0.999      The system is unavailable for 86.4 seconds in a 24-hour period.

                         0.9999     The system is unavailable for 8.64 seconds in a 24-hour period, or roughly one minute per week.



                     Figure 12.7 Availability specification

                                          failure. So, POFOD = 0.001 means that there is a 1/1,000 chance that a failure
                                          will occur when a demand is made.
                                       2.  Rate of occurrence of failures (ROCOF) This metric sets out the probable
                                          number of system failures that are likely to be observed relative to a certain time
                                          period (e.g., an hour), or to the number of system executions. In the example
                                          above, the ROCOF is 1/1,000. The reciprocal of ROCOF is the mean time to
                                          failure (MTTF), which is sometimes used as a reliability metric. MTTF is the
                                          average number of time units between observed system failures. Therefore,
                                          a ROCOF of two failures per hour implies that the mean time to failure is
                                          30 minutes.
                                       3.  Availability (AVAIL) The availability of a system reflects its ability to deliver
                                          services when requested. AVAIL is the probability that a system will be
                                          operational when a demand is made for service. Therefore, an availability of 0.9999
                                          means that, on average, the system will be available for 99.99% of the operating
                                          time. Figure 12.7 shows what different levels of availability mean in practice.
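
The downtime figures in Figure 12.7 follow directly from the availability value: the expected unavailable time is (1 - AVAIL) multiplied by the length of the period. A minimal sketch of that arithmetic is shown here; the function name `downtime_seconds` and its default 24-hour period are illustrative choices, not part of any standard library.

```python
def downtime_seconds(availability: float, period_seconds: float = 24 * 60 * 60) -> float:
    """Expected unavailable time (in seconds) within a period, given AVAIL."""
    if not 0.0 <= availability <= 1.0:
        raise ValueError("availability must be a probability between 0 and 1")
    return (1.0 - availability) * period_seconds


# Print the expected daily downtime for the availability levels in Figure 12.7.
for avail in (0.9, 0.99, 0.999, 0.9999):
    print(f"AVAIL {avail}: {downtime_seconds(avail):.2f} seconds per day")
```

Passing a different `period_seconds` (for example, 7 * 24 * 60 * 60 for a week) gives the corresponding figure for other periods.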


                                         POFOD should be used as a reliability metric in situations where a failure on
                                       demand can lead to a serious system failure. This applies irrespective of the
                                       frequency of the demands. For example, a protection system that monitors a chemical
                                       reactor and shuts down the reaction if it is overheating should have its reliability
                                       specified using POFOD. Generally, demands on a protection system are infrequent
                                       as the system is a last line of defense, after all other recovery strategies have failed.
                                       Therefore a POFOD of 0.001 (1 failure in 1,000 demands) might seem to be risky,
                                       but if there are only two or three demands on the system in its lifetime, then you will
                                       probably never see a system failure.
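
The claim that a failure is unlikely to be seen can be checked with a short calculation. The sketch below assumes that demands fail independently of one another, which the text does not state; under that assumption, the probability of at least one failure in n demands is 1 - (1 - POFOD)^n. The function name is hypothetical.

```python
def prob_at_least_one_failure(pofod: float, demands: int) -> float:
    """Probability of one or more failures in a number of demands.

    Assumes each demand fails independently with probability `pofod`.
    """
    if not 0.0 <= pofod <= 1.0:
        raise ValueError("pofod must be a probability between 0 and 1")
    if demands < 0:
        raise ValueError("demands must not be negative")
    return 1.0 - (1.0 - pofod) ** demands


# Three demands in the system's lifetime at POFOD = 0.001.
print(f"{prob_at_least_one_failure(0.001, 3):.6f}")
```

With POFOD = 0.001 and three demands, the result is about 0.003, that is, roughly a 0.3% chance of ever observing a failure.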
                                         ROCOF is the most appropriate metric to use in situations where demands on
                                       systems are made regularly rather than intermittently. For example, in a system that
                                       handles a large number of transactions, you may specify a ROCOF of 10 failures per
                                       day. This means that you are willing to accept that an average of 10 transactions per
                                       day will not complete successfully and will have to be canceled. Alternatively, you
                                       may specify ROCOF as the number of failures per 1,000 transactions.
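
The two ways of expressing ROCOF, and its reciprocal relationship with MTTF, can be illustrated with a small sketch. The function names are hypothetical, and the daily volume of 50,000 transactions is an assumed figure used only for illustration; it does not come from the text.

```python
def mttf_minutes(rocof_per_hour: float) -> float:
    """Mean time to failure in minutes, as the reciprocal of ROCOF per hour."""
    if rocof_per_hour <= 0:
        raise ValueError("ROCOF must be positive")
    return 60.0 / rocof_per_hour


def failures_per_1000_transactions(failures_per_day: float,
                                   transactions_per_day: float) -> float:
    """Re-express a per-day ROCOF relative to the number of transactions."""
    if transactions_per_day <= 0:
        raise ValueError("transactions_per_day must be positive")
    return 1000.0 * failures_per_day / transactions_per_day


# Two failures per hour imply a mean time to failure of 30 minutes.
print(mttf_minutes(2))

# 10 failures per day at an assumed volume of 50,000 transactions per day.
print(failures_per_1000_transactions(10, 50_000))
```

Expressing ROCOF per 1,000 transactions keeps the requirement meaningful if the transaction volume changes, whereas a per-day figure becomes stricter as volume grows.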
                                         If the absolute time between failures is important, you may specify the reliability
                                       as the mean time between failures. For example, if you are specifying the required