348 Chapter 13 Dependability engineering
Change management, discussed in Chapter 25, is concerned with managing
changes to a system, ensuring that accepted changes are actually implemented and
confirming that planned releases of the software include the planned changes. One
common problem with software is that the wrong components are included in a system
build. This can lead to a situation where an executing system includes components that
have not been checked during the development process. Configuration management
procedures must be defined as part of the change management process to ensure that
this does not happen.
There is a widely held view that agile approaches, as discussed in Chapter 3, are
not really suitable for dependable processes (Boehm, 2002). Agile approaches focus
on developing the software rather than on documenting what has been done. They
often have a fairly informal approach to change and quality management. Plan-based
approaches to dependable systems development, which create documentation that
regulators and other external system stakeholders can understand, are generally
preferred. Nevertheless, the benefits of agile approaches are equally applicable to
critical systems. There have been reports of successes in applying agile methods in this
area (Lindvall et al., 2004), and it is likely that variants of agile methods that are
suitable for critical systems engineering will be developed.
13.3 Dependable system architectures
As I have discussed, dependable systems development should be based around a
dependable process. However, although you probably need a dependable process to
create dependable systems, this is not enough in itself to ensure dependability. You
also need to design a system architecture for dependability, especially when fault
tolerance is required. This means that the architecture has to be designed to include
redundant components and mechanisms that allow control to be switched from one
component to another.
Examples of systems that may need fault-tolerant architectures are systems in
aircraft that must be in operation throughout the duration of the flight, telecommunication
systems, and critical command and control systems. Pullum (2001) describes
different types of fault-tolerant architecture that have been proposed, and
Torres-Pomales (2000) surveys software fault-tolerance techniques.
The simplest realization of a dependable architecture is in replicated servers, where
two or more servers carry out the same task. Requests for processing are channeled
through a server management component that routes each request to a particular server.
This component also keeps track of server responses. In the event of server failure,
which is usually detected by a lack of response, the faulty server is switched out of the
system. Unprocessed requests are resubmitted to other servers for processing.
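The replicated server arrangement described above can be illustrated with a minimal sketch. It is not a definitive implementation: the names ServerManager, ServerTimeout, and submit are hypothetical, each server is modeled as a plain function, and a "lack of response" is simulated by raising an exception rather than by a real network timeout.

```python
from typing import Callable, Dict, List


class ServerTimeout(Exception):
    """Hypothetical signal that a server gave no response."""


class ServerManager:
    """Routes each request to one replica and switches out failed replicas."""

    def __init__(self, servers: Dict[str, Callable[[str], str]]) -> None:
        self.servers = servers            # name -> function doing the shared task
        self.switched_out: List[str] = []  # replicas removed after a failure
        self._next = 0                     # round-robin counter (an assumption)

    def active(self) -> List[str]:
        return [n for n in self.servers if n not in self.switched_out]

    def submit(self, request: str) -> str:
        while True:
            active = self.active()
            if not active:
                raise RuntimeError("no servers available")
            name = active[self._next % len(active)]
            self._next += 1
            try:
                return self.servers[name](request)
            except ServerTimeout:
                # No response: switch the faulty server out of the system
                # and resubmit the unprocessed request to another server.
                self.switched_out.append(name)


def healthy(request: str) -> str:
    return f"processed {request}"


def faulty(request: str) -> str:
    raise ServerTimeout(request)


manager = ServerManager({"s1": faulty, "s2": healthy})
result = manager.submit("req-1")
```

Here the request is first routed to s1, which does not respond, so s1 is switched out and the request is resubmitted to s2. Round-robin routing is only one possible policy; the text does not prescribe how the management component chooses a server.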
This replicated server approach is widely used for transaction processing systems,
where it is easy to maintain copies of transactions to be processed. Transaction
processing systems are designed so that data is only updated once a transaction has
finished correctly, so delays in processing do not affect the integrity of the system.
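The property that data is only updated once a transaction has finished correctly can be sketched as follows. This is an illustrative assumption about one way to achieve it, not a description of any particular transaction processing system: the function run_transaction and the in-memory dictionary store are hypothetical, and the sketch ignores concurrency and persistent storage.

```python
from typing import Callable, Dict, List

Store = Dict[str, int]
Step = Callable[[Store], None]


def run_transaction(store: Store, steps: List[Step]) -> bool:
    """Apply all steps to a private copy; update the store only on success."""
    working = dict(store)  # the shared data is not touched while steps run
    try:
        for step in steps:
            step(working)
    except Exception:
        # The transaction did not finish correctly: discard the copy.
        # The store is unchanged, so the transaction can be resubmitted.
        return False
    # All steps finished: publish the result (single-threaded sketch only).
    store.clear()
    store.update(working)
    return True


def debit(s: Store) -> None:
    s["alice"] -= 30


def credit(s: Store) -> None:
    s["bob"] += 30


def crash(s: Store) -> None:
    raise RuntimeError("server failed mid-transaction")


accounts: Store = {"alice": 100, "bob": 0}
first_try = run_transaction(accounts, [debit, crash, credit])  # fails, no update
second_try = run_transaction(accounts, [debit, credit])        # resubmitted
```

Because the failed attempt leaves the accounts untouched, resubmitting the same transaction to another server cannot apply the debit twice, which is why a delay or failure during processing does not compromise the integrity of the data.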

