    Consider a board on which U14 pin 12 is tied to ground. Injecting that fault
 into the simulation produces one of three results. If the input vector never causes
 pin 12 to go high, the output vector will be correct and the simulation does not
 detect that fault. If the vector tries to drive the pin high, the output will be
 incorrect. If the faulty output pattern is unique, the simulation has located the
 specific failure. If, however, the faulty output signature is identical to the output
 when the board contains a different fault, then that vector cannot be used to
 identify either fault. Another vector or vectors must be selected, adding to or
 replacing that vector in the set.
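    To make the bookkeeping concrete, the sketch below (in Python, and not part of
 the original text) simulates a toy two-gate circuit, injects stuck-at faults one at
 a time, and compares each faulty output signature against the fault-free circuit and
 against the other faults. The circuit, node names, test vectors, and fault list are
 all invented for illustration; a production fault simulator works on the complete
 gate-level netlist and a far larger fault list.

    from itertools import product

    def good_circuit(a, b, c):
        # Toy circuit: n1 = a AND b, out = n1 OR c
        return (a & b) | c

    def faulty_circuit(a, b, c, node, value):
        # Same circuit with one node forced to a stuck-at value
        n1 = a & b
        if node == "n1":
            n1 = value
        out = n1 | c
        if node == "out":
            out = value
        return out

    faults = [("n1", 0), ("n1", 1), ("out", 0), ("out", 1)]
    vectors = [(0, 0, 1), (1, 0, 0), (0, 1, 0)]   # deliberately incomplete test set

    good = tuple(good_circuit(a, b, c) for a, b, c in vectors)
    signatures = {f: tuple(faulty_circuit(a, b, c, *f) for a, b, c in vectors)
                  for f in faults}

    for fault, sig in signatures.items():
        if sig == good:
            verdict = "undetected: no vector ever exercises the fault"
        elif list(signatures.values()).count(sig) > 1:
            verdict = "detected but not located: signature matches another fault"
        else:
            verdict = "detected and located: signature is unique"
        print(fault, verdict)

    With the incomplete vector set shown, one fault goes undetected, one produces a
 unique signature and is therefore located, and two faults share a signature and
 cannot be told apart without adding or replacing vectors, mirroring the three
 outcomes described above.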
    No matter how carefully an engineer performs fault simulation, some faults
 will always go undetected. The philosopher Voltaire once postulated that God is a
 comedian playing to an audience that is afraid to laugh. Every now and then, God
 plays a little joke to remind us that we are not as smart as we think we are. Very
 high estimated fault coverages often fall into that category.
    At the device level, in its day the 80386 microprocessor was one of the most
 carefully designed components in the industry's history. Intel laid out design-for-
 testability rules, made sure that many of the most critical functions were verified
 by internal self-test, and otherwise took elaborate steps to guarantee the product's
 quality. Intel estimated fault coverage at better than 97 percent. Yet, despite all of
 their efforts, some of the earliest devices exhibited a very peculiar bug. If software
 was running in protected mode and the device resided in a particular temperature
 range, multiplying two numbers from a specific numerical range might
 produce the wrong answer. The only saving grace was that no software at the
 time required protected mode. Everything had been designed for the
 80286-class environment, which lacked that capability.
    The now-classic problem with the original Pentium processors also illustrates
 this point, while graphically demonstrating how customer perceptions of such
 problems have changed. After the experience with the 80386, Intel redoubled its
 efforts in design and test to ensure that subsequent devices (the Pentium contained
 a then-staggering 3.2 million transistors on a piece of silicon the size of a dime)
 would work properly. To maximize calculation speed, the Pentium's floating-point
 division algorithm relies on a lookup table rather than computing every result
 from scratch. That table, however, contained a small error in a few of its entries. After
 thoroughly researching the problem, Intel contended at first that the error was so
 tiny that aside from astronomers and others making extremely precise empirical
 calculations, almost no computer users would ever notice the discrepancy.
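    The consequence of a bad lookup-table entry is easy to see in miniature. The toy
 sketch below bears no relation to the Pentium's actual divider; it simply computes
 squares through a precomputed table in which one entry has been corrupted. Only the
 inputs that reference the corrupted entry return a wrong answer; every other result
 is exact, which is why such a flaw can slip past even extensive testing.

    # Toy illustration of a corrupted lookup-table entry; the table, function,
    # and bad entry are invented and bear no relation to the Pentium's divider.
    table = {n: n * n for n in range(256)}   # precomputed squares of small integers
    table[113] = 12768                       # corrupted entry; the correct value is 12769

    def square_via_table(n):
        return table[n]

    for n in (7, 100, 113):
        got, expected = square_via_table(n), n * n
        status = "WRONG" if got != expected else "ok"
        print(f"{n}: table gives {got}, exact answer is {expected} ({status})")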
    Unfortunately for Intel, this adventure occurred after the proliferation of the
 Internet. Electronic bulletin boards everywhere carried news of the problem.
 Letters of complaint began to appear from people claiming to have reproduced the
 bug. Further investigation showed that these people had indeed performed a
 division that had yielded the small discrepancy. This result was unavoidable.
 The exact location of the error had been posted on the Internet, and engineers
 had downloaded the information to check it out for themselves. Since the error
 resided in a lookup table, any calculation that included those specific locations in