Consider a board on which U14 pin 12 is tied to ground. Injecting that fault
into the simulation produces one of three results. If the input vector never causes
pin 12 to go high, the output vector will be correct and the simulation does not
detect that fault. If the vector tries to drive the pin high, the output will be
incorrect. If the faulty output pattern is unique, the simulation has located the
specific failure. If, however, the faulty output signature is identical to the output
when the board contains a different fault, then that vector cannot be used to
identify either fault. Another vector or vectors must be selected, adding to or
replacing that vector in the set.
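To make that bookkeeping concrete, the following sketch (in Python, using an
invented two-gate circuit, a handful of hypothetical stuck-at faults, and a toy
vector set) sorts each injected fault into the three outcomes just described:
undetected, detected with a unique signature, or detected but aliased with
another fault's signature.

    # A minimal sketch of the fault-simulation bookkeeping described above.
    # The circuit (out1 = a AND b, out2 = b OR c), net names, faults, and
    # vectors are all invented for illustration; real fault simulators work
    # on gate-level netlists and far larger vector sets.

    def simulate(a, b, c, fault=None):
        """Evaluate the circuit; 'fault' forces one net to a constant value."""
        nets = {"a": a, "b": b, "c": c}
        if fault and fault[0] in nets:
            nets[fault[0]] = fault[1]          # stuck-at fault on an input net
        out1 = nets["a"] & nets["b"]
        out2 = nets["b"] | nets["c"]
        if fault and fault[0] == "out1":
            out1 = fault[1]                    # stuck-at fault on an output net
        return (out1, out2)

    vectors = [(1, 1, 0), (0, 1, 1), (1, 0, 0)]
    faults = [("a", 0), ("out1", 0), ("b", 0), ("c", 0), ("c", 1)]

    # A fault's "signature" is its full sequence of outputs over the vector set.
    good = tuple(simulate(*v) for v in vectors)
    sigs = {f: tuple(simulate(*v, fault=f) for v in vectors) for f in faults}

    for f, sig in sigs.items():
        if sig == good:
            verdict = "undetected: no vector exercises the fault"
        elif list(sigs.values()).count(sig) > 1:
            verdict = "detected but not located: signature aliases another fault"
        else:
            verdict = "detected and located: unique failing signature"
        print(f"{f[0]} stuck-at-{f[1]}: {verdict}")

Run against this toy vector set, c stuck-at-0 goes undetected, a stuck-at-0 and
out1 stuck-at-0 alias one another, and the remaining faults yield unique
signatures; adding or substituting vectors, exactly as described above, is what
would separate the aliased pair.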
No matter how carefully an engineer performs fault simulation, some faults
will always go undetected. The philosopher Voltaire once postulated that God is a
comedian playing to an audience that is afraid to laugh. Every now and then, God
plays a little joke to remind us that we are not as smart as we think we are. Very
high estimated fault coverages often fall into that category.
At the device level, in its day the 80386 microprocessor was one of the most
carefully designed components in the industry's history. Intel laid out design-for-
testability rules, made sure that many of the most critical functions were verified
by internal self-test, and otherwise took elaborate steps to guarantee the product's
quality. Intel estimated better than 97 percent fault coverage. Yet, despite all of
those efforts, some of the earliest devices exhibited a very peculiar bug. If software
was running in protected mode and the device resided in a particular temperature
range, multiplying two numbers together from a specific numerical range might
produce the wrong answer. The only saving grace was that no software at the time
required protected mode. Everything had been designed for
the 80286-class environment, which lacked that capability.
The now-classic problem with the original Pentium processors also illustrates
this point, while graphically demonstrating how customer perceptions of such
problems have changed. After the experience with the 80386, Intel redoubled its
efforts in design and test to ensure that subsequent devices (the Pentium contained
a then-staggering 3.2 million transistors on a piece of silicon the size of a dime)
would work properly. To maximize processor calculation speeds, the Pentium's
floating-point divide unit relied on a lookup table rather than deriving each result
entirely from scratch. This table, however, contained a small error in a few of its
entries. After
thoroughly researching the problem, Intel contended at first that the error was so
tiny that aside from astronomers and others making extremely precise empirical
calculations, almost no computer users would ever notice the discrepancy.
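As a purely illustrative aside (the toy table below is not the Pentium's actual
divider hardware), the following sketch shows why an error baked into a
precomputed lookup table behaves the way the rest of this story describes: any
operand pair that indexes the bad entry is wrong every time, on every part that
shares the table, while all other operands remain exact.

    # Toy "hardware" divide for small operands: the quotient is read straight
    # from a precomputed table rather than derived at run time. The table,
    # operands, and corrupted entry are invented for illustration only.
    table = {(n, d): n // d for n in range(1, 16) for d in range(1, 16)}
    table[(13, 4)] = 2            # one corrupted entry; the correct quotient is 3

    def hw_divide(n, d):
        return table[(n, d)]      # no arithmetic at run time, just the lookup

    assert hw_divide(12, 4) == 3          # pairs that miss the bad entry are exact
    assert hw_divide(13, 4) != 13 // 4    # the pair that hits it fails, reproducibly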
Unfortunately for Intel, this adventure occurred after the proliferation of the
Internet. Electronic bulletin boards everywhere carried news of the problem.
Letters of complaint began to appear from people claiming to have reproduced the
bug. Further investigation showed that these people had indeed performed a
division that yielded the small discrepancy. This result was unavoidable.
The exact location of the error had been posted on the Internet, and engineers
had downloaded the information to check it out for themselves. Since the error
resided in a lookup table, any calculation that included those specific locations in