300 Chapter 11 Dependability and security
and monitoring systems in aircraft, process control systems in chemical and
pharmaceutical plants, and automobile control systems.
Hardware control of safety-critical systems is simpler to implement and analyze
than software control. However, we now build systems of such complexity that they
cannot be controlled by hardware alone. Software control is essential because of the
need to manage large numbers of sensors and actuators with complex control laws. For
example, advanced, aerodynamically unstable, military aircraft require continual
software-controlled adjustment of their flight surfaces to ensure that they do not crash.
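Software control of this kind can be pictured as a periodic loop that reads sensors, applies a control law, and writes bounded commands to actuators. The sketch below is purely illustrative, not real avionics code: the proportional control law, the gain, and the actuator limits are assumptions chosen for the example.

```python
# Illustrative sketch of one step of a software control loop.
# The control law, gain, and limits are invented for this example,
# NOT taken from any real flight-control system.

def proportional_control(setpoint, measured, gain=0.8):
    """A simple proportional control law: the correction is
    proportional to the error between setpoint and measurement."""
    return gain * (setpoint - measured)

def control_step(setpoint, sensor_reading):
    """One iteration of the loop: compute an actuator command
    from the current sensor reading, then clamp it to the
    actuator's physical range (limits here are arbitrary)."""
    command = proportional_control(setpoint, sensor_reading)
    return max(-1.0, min(1.0, command))
```

In a real system this step would run at a fixed rate (often hundreds of times per second) for each of many control surfaces, which is why such behavior is impractical to implement in hardware alone.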
Safety-critical software falls into two classes:
1. Primary safety-critical software This is software that is embedded as a con-
troller in a system. Malfunctioning of such software can cause a hardware
malfunction, which results in human injury or environmental damage. The
insulin pump software, introduced in Chapter 1, is an example of a primary
safety-critical system. System failure may lead to user injury.
2. Secondary safety-critical software This is software that can indirectly result in
an injury. An example of such software is a computer-aided engineering design
system whose malfunctioning might result in a design fault in the object being
designed. This fault may cause injury to people if the designed system malfunc-
tions. Another example of a secondary safety-critical system is the mental
health care management system, MHC-PMS. Failure of this system, whereby an
unstable patient may not be treated properly, could lead to that patient injuring
themselves or others.
System reliability and system safety are related but distinct: a reliable system can be unsafe and vice versa. Even when the software conforms to its specification, it may behave in such a way that the resultant system behavior leads to an accident. There are four reasons why software systems that are reliable are not necessarily safe:
1. We can never be 100% certain that a software system is fault-free and fault-
tolerant. Undetected faults can be dormant for a long time and software
failures can occur after many years of reliable operation.
2. The specification may be incomplete in that it does not describe the required
behavior of the system in some critical situations. A high percentage of system
malfunctions (Boehm et al., 1975; Endres, 1975; Lutz, 1993; Nakajo and Kume,
1991) are the result of specification rather than design errors. In a study of errors
in embedded systems, Lutz concludes:
. . . difficulties with requirements are the key root cause of the safety-
related software errors, which have persisted until integration and system
testing.
3. Hardware malfunctions may cause the system to behave in an unpredictable
way, and present the software with an unanticipated environment. When compo-
nents are close to physical failure, they may behave erratically and generate
signals that are outside the ranges that can be handled by the software.
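One defensive measure against such erratic signals is to validate every sensor reading against its physically plausible range before using it, and to fall back to a safe action when the value is out of range. A minimal sketch, in which the valid range and the fallback policy (reuse the last known-good value and raise a fault flag) are assumptions for illustration:

```python
# Illustrative range check on a sensor reading. The valid range and
# the fallback policy are invented for this example.

VALID_TEMP_RANGE = (-40.0, 125.0)  # plausible sensor limits (assumed)

def validated_temperature(raw_reading, last_good_value):
    """Return (value, fault_flag): the reading if it is physically
    plausible, otherwise the last known-good value with the fault
    flagged so higher-level logic can react."""
    low, high = VALID_TEMP_RANGE
    if low <= raw_reading <= high:
        return raw_reading, False   # reading accepted, no fault
    return last_good_value, True    # out of range: fault flagged
```

Checks like this do not make the failing hardware reliable, but they keep an erratic component from driving the software, and hence the system, into an unsafe state.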