12.3 Reliability specification
that compensate for software failure, there may also be related reliability requirements
to help detect and recover from hardware failures and operator errors.
Reliability is different from safety and security in that it is a measurable system
attribute. That is, it is possible to specify the level of reliability that is required,
monitor the system’s operation over time, and check if the required reliability has been
achieved. For example, a reliability requirement might be that system failures that
require a reboot should not occur more than once per week. Every time such a failure
occurs, it can be logged, and you can check if the required level of reliability has
been achieved. If not, you either modify your reliability requirement or submit a
change request to address the underlying system problems. You may decide to
accept a lower level of reliability because of the costs of changing the system to
improve reliability or because fixing the problem may have adverse side effects,
such as lower performance or throughput.
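
The check described above can be sketched in a few lines of Python. This is only an illustration, assuming that failures are logged as timestamps and that "once per week" is interpreted as at most one failure in any ISO calendar week. The function name `weeks_violating_requirement` and the sample log are hypothetical.

```python
from collections import Counter
from datetime import datetime


def weeks_violating_requirement(failure_times, max_per_week=1):
    """Return the (ISO year, ISO week) pairs with more failures than allowed."""
    # Group the logged failures by ISO year and week number.
    counts = Counter(tuple(t.isocalendar())[:2] for t in failure_times)
    return sorted(week for week, n in counts.items() if n > max_per_week)


# Hypothetical log of failures that required a reboot.
failure_log = [
    datetime(2024, 3, 4, 9, 15),
    datetime(2024, 3, 6, 14, 2),
    datetime(2024, 3, 20, 23, 40),
]

violations = weeks_violating_requirement(failure_log)
if violations:
    print("Requirement not met in weeks:", violations)
else:
    print("Requirement met")
```

A non-empty result is the trigger for the decision discussed above: either relax the requirement or submit a change request.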
By contrast, both safety and security are about avoiding undesirable situations,
rather than specifying a desired ‘level’ of safety or security. Even one such situation
in the lifetime of a system may be unacceptable and, if it occurs, system changes
have to be made. It makes no sense to make statements like ‘system faults should
result in fewer than 10 injuries per year.’ As soon as one injury occurs, the system
problem must be rectified.
Reliability requirements are, therefore, of two kinds:
1. Non-functional requirements, which define the number of failures that are
acceptable during normal use of the system, or the time in which the system is
unavailable for use. These are quantitative reliability requirements.
2. Functional requirements, which define system and software functions that
avoid, detect, or tolerate faults in the software and so ensure that these faults do
not lead to system failure.
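
As a rough illustration of the first kind of requirement, the time a system is unavailable can be turned into an availability figure and compared with a target. The sketch below assumes an observation period measured in hours; the function name `availability` and the 0.995 target are hypothetical values chosen only for the example.

```python
def availability(total_hours, downtime_hours):
    """Fraction of the observation period in which the system was usable."""
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    if not 0 <= downtime_hours <= total_hours:
        raise ValueError("downtime_hours must lie between 0 and total_hours")
    return (total_hours - downtime_hours) / total_hours


# Hypothetical figures: a 30-day period (720 hours) with 3 hours of downtime.
observed = availability(total_hours=720, downtime_hours=3)
meets_target = observed >= 0.995  # assumed target, not taken from the text
```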
Quantitative reliability requirements lead to related functional system requirements.
To achieve some required level of reliability, the functional and design
requirements of the system should specify the faults to be detected and the actions
that should be taken to ensure that these faults do not lead to system failures.
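
A functional reliability requirement of this kind names a fault to detect and the action to take when it occurs. The following minimal sketch assumes one such requirement: an input value outside a defined range is a fault, and the action is to reuse the last valid value. The function `checked_reading` and the default range limits are hypothetical.

```python
def checked_reading(raw_value, last_good, low=0.0, high=100.0):
    """Detect an invalid reading and substitute the last valid value.

    Returns a pair (value_to_use, fault_detected).
    """
    # A valid reading is numeric and inside the permitted range.
    if isinstance(raw_value, (int, float)) and low <= raw_value <= high:
        return float(raw_value), False
    # Fault detected: tolerate it so that it does not propagate further.
    return float(last_good), True


value, fault = checked_reading(250, last_good=47.5)
if fault:
    print("Fault detected, using last good value:", value)
```

Returning the fault flag alongside the value lets the caller log the event, which in turn supports the monitoring of quantitative requirements.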
The process of reliability specification can be based on the general risk-driven
specification process shown in Figure 12.1:
1. Risk identification. At this stage, you identify the types of system failures that
may lead to economic losses of some kind. For example, an e-commerce system
may be unavailable so that customers cannot place orders, or a failure that
corrupts data may require time to restore the system database from a backup and
rerun transactions that have been processed. The list of possible failure types,
shown in Figure 12.6, can be used as a starting point for risk identification.
2. Risk analysis. This involves estimating the costs and consequences of different
types of software failure and selecting high-consequence failures for further
analysis.