Page 369 - A Practical Guide from Design Planning to Manufacturing

Silicon Debug and Test  339

        much lower frequency than all the others, this is considered a speedpath
        bug. These paths may have to be improved before the product can ship
        at a reasonable frequency.
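
          As a rough illustration of spotting a speedpath, the sketch below
        assumes each path has a measured maximum frequency and flags any path
        that falls well below the typical one. The path names, the frequencies,
        and the 0.9 margin are all hypothetical values chosen for illustration.

```python
from statistics import median

def find_speedpaths(path_fmax_ghz, margin=0.9):
    """Return names of paths whose max frequency is far below the typical path.

    margin is an assumed threshold: a path is flagged when its maximum
    frequency is less than margin times the median of all paths.
    """
    typical = median(path_fmax_ghz.values())
    return sorted(name for name, f in path_fmax_ghz.items() if f < margin * typical)

# Hypothetical per-path maximum frequencies in GHz (invented for this example).
paths = {"alu_bypass": 3.4, "cache_tag": 3.5, "fp_round": 2.6, "decode": 3.6}

# The whole chip can run no faster than its slowest path.
chip_fmax = min(paths.values())
speedpaths = find_speedpaths(paths)

print(f"chip limited to {chip_fmax} GHz by: {speedpaths}")
```

          In this made-up data set a single path holds the whole chip well
        below what the other paths could support, which is the situation the
        text calls a speedpath bug.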
          The notion of a power bug is relatively new, but new high-power
        processors or processors meant for portable applications may struggle
        to fit within the power limits of their platform. Correct results, good
        performance, and high frequency are not sufficient. The processor must
        be able to achieve these at a certain target power level. If the power
        required to run an application is much more than expected, this may
        have to be corrected in order to ship. For example, circuits with
        unintended contention cause power bugs. A design flaw might cause two
        gates to simultaneously drive the same wire toward opposite voltages.
        Depending on the relative sizes of the gates, the wire may end up at the
        logically correct voltage, but meanwhile the circuit is consuming
        dramatically more power than it should.
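
          The contention case can be approximated with a simple model. The
        sketch below assumes each fighting gate behaves like a fixed resistor,
        so the two form a voltage divider between the supply and ground. The
        function name, the resistances, and the supply voltage are hypothetical;
        real transistors are not linear resistors.

```python
def contention(vdd, r_pull_up, r_pull_down):
    """Model two gates fighting over one wire as a resistive divider.

    Assumption: the gate driving high acts as a resistor r_pull_up to vdd,
    and the gate driving low acts as a resistor r_pull_down to ground.
    Returns (wire voltage in volts, static current in amps, power in watts).
    """
    total = r_pull_up + r_pull_down
    v_wire = vdd * r_pull_down / total   # divider output seen on the wire
    current = vdd / total                # current flowing supply to ground
    power = vdd * current                # power burned with no useful work
    return v_wire, current, power

# Hypothetical values: a strong pull-up (1 kOhm) fights a weak pull-down (9 kOhm).
v_wire, current, power = contention(1.0, 1_000.0, 9_000.0)

print(f"wire = {v_wire:.2f} V, current = {current * 1e6:.0f} uA, "
      f"power = {power * 1e6:.0f} uW")
```

          With these assumed values the wire settles at 0.9 V, close enough to
        the supply to be read as a logical high, yet about 100 uA flows
        continuously through the two gates. A functional test would pass while
        the power measurement would not.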
          Circuit marginality checks look for any of the other types of bugs that
        appear only at extremes of voltage, temperature, or process. The die may
        function perfectly below a certain temperature, but begin to show logical
        failures above this temperature. The design is fundamentally correct but
        not robust. To be reliable, the processor must be able to operate
        successfully throughout a range of process and environmental conditions.
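
          A marginality check of this kind can be pictured as a sweep over
        operating conditions. The sketch below is a toy example only: the
        passes function stands in for running a real test on a die, and its
        pass/fail rule, the voltage points, and the temperature points are all
        invented for illustration.

```python
def passes(voltage_mv, temp_c):
    """Hypothetical stand-in for running a test on a die at one condition.

    Assumed toy rule: the die works up to a maximum temperature that rises
    with supply voltage (40 C at 900 mV, 100 C at 1100 mV).
    """
    max_temp_c = 40 + (voltage_mv - 900) * 3 // 10
    return temp_c <= max_temp_c

def sweep(voltages_mv, temps_c):
    """Run the test at every voltage/temperature combination."""
    return {(v, t): passes(v, t) for v in voltages_mv for t in temps_c}

voltages_mv = [900, 1000, 1100]
temps_c = [0, 50, 100]
results = sweep(voltages_mv, temps_c)

# Print a small pass/fail grid, hottest row first, one column per voltage.
for t in reversed(temps_c):
    row = " ".join("P" if results[(v, t)] else "F" for v in voltages_mv)
    print(f"{t:>4} C  {row}")

# Conditions where the toy die is marginal.
failing = [cond for cond, ok in results.items() if not ok]
```

          In this toy model the failures cluster in the hot, low-voltage
        corner, while the same die passes everywhere else. That pattern, a
        correct design that fails only at the extremes, is what the text
        describes as a circuit marginality problem.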
          Logic bugs and performance bugs are caused by unexpected behavior
        from the RTL model or from silicon that does not match the RTL. The
        design flaw may have existed in RTL for years without being noticed
        because the particular sequence of instructions needed to trigger it was
        never attempted. If the RTL does not show the bug, it may be that the
        circuit design has not faithfully reproduced the behavior of the RTL.
        Speedpaths, power bugs, and circuit marginality problems arise because
        circuit behavior does not match simulation. There are pre-silicon checks
        for all of these problems, but these simulations are not foolproof. Flawed
        inputs to the simulator or software bugs in the simulator itself can
        prevent design bugs from being detected pre-silicon. Even when used
        perfectly, these simulations are only approximations of reality. In order to
        keep run time within reason, the simulators must make simplifications
        that are true in the vast majority of cases but not always. The ultimate
        test of a design’s behavior is to test the chips in a real platform.


        Validation platforms and tests
        To find silicon bugs, tests are run on validation platforms. These are
        systems with much or all of the functionality of the real systems that will
        ultimately use the processors. They often also include extra equipment
        specifically to help with post-silicon validation. A logic analyzer may be
        used to directly monitor and stimulate the processor bus. The system can