
212           PART TWO  MANAGING SOFTWARE PROJECTS


                          A comprehensive discussion of statistical SQA is beyond the scope of this book.
                       Interested readers should see [SCH98], [KAP95], or [KAN95].



                 8.8   SOFTWARE RELIABILITY
                       There is no doubt that the reliability of a computer program is an important element
                       of its overall quality. If a program repeatedly and frequently fails to perform, it
                       matters little whether other software quality factors are acceptable.
         WebRef: The Reliability Analysis Center provides much useful information on reliability,
         maintainability, supportability, and quality at rac.iitri.org

                          Software reliability, unlike many other quality factors, can be measured directly and
                       estimated using historical and developmental data. Software reliability is defined in
                       statistical terms as "the probability of failure-free operation of a computer program in a
                       specified environment for a specified time" [MUS87]. To illustrate, program X is estimated
                       to have a reliability of 0.96 over eight elapsed processing hours. In other words, if
                       program X were to be executed 100 times, each run requiring eight hours of elapsed
                       processing time (execution time), it is likely to operate correctly (without failure)
                       96 times out of 100.
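
The program X illustration can be expressed as a simple ratio of failure-free runs to total runs. The following is a minimal sketch, not a method given in the text: the function name `estimate_reliability` and its parameters are hypothetical, and it assumes every run covers the same specified time (eight hours here) in the same environment.

```python
def estimate_reliability(failure_free_runs: int, total_runs: int) -> float:
    """Estimate reliability as the fraction of runs that completed without failure.

    Assumes (hypothetically) that each run covers the same specified time
    in the same specified environment, so the runs are comparable.
    """
    if total_runs <= 0:
        raise ValueError("total_runs must be positive")
    if not 0 <= failure_free_runs <= total_runs:
        raise ValueError("failure_free_runs must lie between 0 and total_runs")
    return failure_free_runs / total_runs


# Program X: 100 eight-hour executions, 96 of them failure-free.
reliability_x = estimate_reliability(failure_free_runs=96, total_runs=100)
print(f"Estimated reliability over eight hours: {reliability_x:.2f}")
```

An estimate of this kind is only as good as the sample behind it; a small number of runs, or runs in a different environment, would not support the same figure.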
                          Whenever software reliability is discussed, a pivotal question arises: What is meant
                       by the term failure? In the context of any discussion of software quality and
                       reliability, failure is nonconformance to software requirements. Yet, even within this
                       definition, there are gradations. A failure can be merely annoying or it can be
                       catastrophic. One failure can be corrected within seconds while another requires weeks
                       or even months to correct. Complicating the issue even further, the correction of one
                       failure may in fact result in the introduction of other errors that ultimately result
                       in other failures.

                       8.8.1  Measures of Reliability and Availability
         Software reliability problems can almost always be traced to errors in design or
         implementation.

                       Early work in software reliability attempted to extrapolate the mathematics of
                       hardware reliability theory (e.g., [ALV64]) to the prediction of software reliability. Most
                       hardware-related reliability models are predicated on failure due to wear rather than
                       failure due to design defects. In hardware, failures due to physical wear (e.g., the
                       effects of temperature, corrosion, shock) are more likely than a design-related
                       failure. Unfortunately, the opposite is true for software. In fact, all software failures
                       can be traced to design or implementation problems; wear (see Chapter 1) does not enter
                       into the picture.
                          There has been debate over the relationship between key concepts in hardware
                       reliability and their applicability to software (e.g., [LIT89], [ROO90]). Although an
                       irrefutable link has yet to be established, it is worthwhile to consider a few simple
                       concepts that apply to both system elements.
                          If we consider a computer-based system, a simple measure of reliability is mean-
                       time-between-failure (MTBF), where

                            MTBF = MTTF + MTTR
                       The acronyms MTTF and MTTR are mean-time-to-failure and mean-time-to-repair,
                       respectively.
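
The MTBF relation can be checked with a short calculation. This is a sketch under stated assumptions: the operating and repair intervals below are invented purely for illustration (they are not data from the text), and the names `mean_time_between_failure`, `uptime_hours`, and `repair_hours` are hypothetical.

```python
from statistics import mean


def mean_time_between_failure(mttf: float, mttr: float) -> float:
    """Return MTBF = MTTF + MTTR, with both inputs in the same time unit."""
    if mttf < 0 or mttr < 0:
        raise ValueError("MTTF and MTTR must be non-negative")
    return mttf + mttr


# Hypothetical observations, in hours (illustrative values only):
# time the system ran before each failure, and time taken to repair it.
uptime_hours = [60.0, 70.0, 74.0]
repair_hours = [2.0, 3.0, 4.0]

mttf = mean(uptime_hours)   # mean-time-to-failure
mttr = mean(repair_hours)   # mean-time-to-repair
mtbf = mean_time_between_failure(mttf, mttr)

print(f"MTTF = {mttf} h, MTTR = {mttr} h, MTBF = {mtbf} h")
```

With these made-up intervals, MTTF is 68 hours and MTTR is 3 hours, so one full failure-and-repair cycle averages 71 hours.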