Page 367 - A Practical Guide from Design Planning to Manufacturing

Silicon Debug and Test  337

          Breakpoint mechanisms give even better control of clocking during
        tests. Breakpoints monitor the processor state for specific events. These
        events could be things like a particular type of instruction, a certain
        number of branch mispredicts or cache misses, or a particular exception.
        The signaling of a breakpoint then causes the processor clock to stop
        automatically so that the scan state can be captured and read. This
        allows for the scan capture to be tied to a particular processor behavior
        rather than a particular clock cycle. Nested trigger events allow capture
        to occur after a specific sequence of events that indicates a particular
        complex failure.
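A nested trigger can be pictured as a small state machine that advances one stage each time the next expected event appears and stops the clock when the last stage is reached. The sketch below is a software model only, not real debug hardware. The event names (`cache_miss`, `exception`), the `NestedBreakpoint` class, and the per-cycle trace format are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical per-cycle snapshot of processor events (name -> count).
Event = Dict[str, int]


@dataclass
class NestedBreakpoint:
    """Fires only after every trigger condition has matched, in order."""
    triggers: List[Callable[[Event], bool]]
    stage: int = 0  # index of the next trigger being waited for

    def observe(self, event: Event) -> bool:
        """Advance at most one stage per cycle; True once all stages matched."""
        if self.stage < len(self.triggers) and self.triggers[self.stage](event):
            self.stage += 1
        return self.stage == len(self.triggers)


def run_until_breakpoint(trace: List[Event], bp: NestedBreakpoint) -> Optional[int]:
    """Return the cycle at which the clock would stop, or None if never."""
    for cycle, event in enumerate(trace):
        if bp.observe(event):
            return cycle  # the clock stops here; scan state could be captured
    return None


# Assumed scenario: stop on the first exception that follows a cache miss.
bp = NestedBreakpoint(triggers=[
    lambda e: e.get("cache_miss", 0) > 0,
    lambda e: e.get("exception", 0) > 0,
])
trace = [
    {"exception": 1},   # cycle 0: ignored, no cache miss seen yet
    {},                 # cycle 1
    {"cache_miss": 1},  # cycle 2: first trigger matches
    {},                 # cycle 3
    {"exception": 1},   # cycle 4: second trigger matches, clock stops
    {"cache_miss": 1},  # cycle 5: never reached
]
stop_cycle = run_until_breakpoint(trace, bp)
```

In this model the exception at cycle 0 does not stop the clock, because the capture is tied to the sequence of events rather than to any single event or cycle count.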
          Scan is appropriate for areas of the die that are a mix of logic and
        sequentials. Scan does not work well for memory arrays, which are
        almost entirely made up of sequentials. For large arrays, the area penalty
        of making all the array elements scannable would be excessive. However,
        it is very difficult to write functional tests that will stimulate all the
        memory cells of all the on-die arrays. A common compromise is adding
        array freeze and dump circuits. These allow the array state to be frozen
        by a breakpoint event. This prevents any further writes to the array. Then
        the array dump allows all the bits of the array to be read directly at the
        pins. These values are compared to expected values to check the array
        for defects.
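The freeze-and-dump flow can be sketched as a toy software model: writes are dropped once the array is frozen, and the dumped contents are compared against expected values. The class and function names here are hypothetical, and the "defect" is injected by hand purely to show the comparison step.

```python
class FreezableArray:
    """Toy model of a memory array with freeze and dump support."""

    def __init__(self, rows):
        self.cells = [0] * rows
        self.frozen = False

    def write(self, row, value):
        # Once frozen, further writes are silently ignored.
        if not self.frozen:
            self.cells[row] = value

    def freeze(self):
        # In hardware this would be driven by a breakpoint event.
        self.frozen = True

    def dump(self):
        # Stand-in for reading every array entry out at the pins.
        return list(self.cells)


def find_mismatches(dumped, expected):
    """Return the row indices whose dumped value differs from the expected one."""
    return [row for row, (d, e) in enumerate(zip(dumped, expected)) if d != e]


expected = [0xA, 0xB, 0xC, 0xD]
array = FreezableArray(rows=4)
for row, value in enumerate(expected):
    array.write(row, value)

array.cells[2] ^= 0x1   # hypothetical defect: one bit of row 2 flips
array.freeze()
array.write(0, 0xF)     # arrives after the freeze, so it is dropped

dumped = array.dump()
bad_rows = find_mismatches(dumped, expected)
```

Freezing first matters because it guarantees the dumped values reflect the array state at the moment of the breakpoint, not whatever was written while the dump was in progress.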
          For very large arrays, dumping the entire contents of the array may take
        excessive test time. Instead, the circuitry to test the array is built on die
        as part of the array itself. This is called built-in self-test (BIST). BIST
        requires a stimulus generator, which writes values into the array, and a
        response analyzer, which checks for the correct read values. BIST adds
        significant area and complexity, but for the largest arrays on die the area of
        a BIST controller may be tiny compared to that of the array itself.
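The two halves of BIST, a stimulus generator and a response analyzer, can be illustrated with a small software model. This is a sketch under stated assumptions: the checkerboard-and-inverse pattern is one simple choice of stimulus, not necessarily what any real BIST controller uses, and `FaultyMemory` with its stuck-at fault injection is a hypothetical stand-in for a defective array.

```python
class FaultyMemory:
    """One bit per address; stuck_at maps an address to a forced bit value."""

    def __init__(self, size, stuck_at=None):
        self.bits = [0] * size
        self.stuck_at = dict(stuck_at or {})

    def write(self, addr, bit):
        self.bits[addr] = self.stuck_at.get(addr, bit)

    def read(self, addr):
        return self.stuck_at.get(addr, self.bits[addr])


def stimulus(size):
    """Yield one list of (address, bit) pairs per pass: checkerboard, then inverse."""
    for invert in (0, 1):
        yield [(addr, (addr & 1) ^ invert) for addr in range(size)]


def run_bist(memory, size):
    """Write each pattern, read it back, and return the sorted failing addresses."""
    failing = set()
    for pattern in stimulus(size):
        for addr, bit in pattern:      # stimulus generator: write phase
            memory.write(addr, bit)
        for addr, bit in pattern:      # response analyzer: read and compare
            if memory.read(addr) != bit:
                failing.add(addr)
    return sorted(failing)


good_result = run_bist(FaultyMemory(8), 8)
bad_result = run_bist(FaultyMemory(8, stuck_at={5: 0}), 8)
```

Running both a pattern and its inverse means every cell is asked to hold a 0 and a 1, so a cell stuck at either value fails at least one pass.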
          Some DFT circuits allow defects not only to be detected but also to be
        bypassed in order to allow the part to be shipped. The simplest method
        for doing this is to disable the part of the die that has the defect. A single
        processor die might be sold as two different server products, one with
        multiprocessor support and one without. If a die has a defect that affects
        only multiprocessor functionality, it may still be sold but only as the
        product that does not support this feature. The same die might also be
        sold as a desktop product with half the on-die cache disabled. If the
        cache is designed so that either half of the array can be disabled, any
        die with a single cache defect can still be sold with the defective half
        turned off.
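The product choices described above amount to a small decision procedure applied to each die's test results. The sketch below is illustrative only: the product names, the `DieTestResult` fields, and the rule that a half-cache desktop part does not need working multiprocessor logic are all assumptions, not details given in the text.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class DieTestResult:
    multiprocessor_ok: bool             # multiprocessor logic passed its tests
    cache_half_ok: Tuple[bool, bool]    # (lower half passed, upper half passed)


def bin_die(result: DieTestResult) -> str:
    """Pick the product (hypothetical names) a die can still be sold as."""
    good_halves = sum(result.cache_half_ok)
    if good_halves == 0:
        return "scrap"                  # no usable cache half remains
    if good_halves == 1:
        # Assumed: the desktop part ships with the defective half disabled
        # and does not use the multiprocessor logic.
        return "desktop_half_cache"
    if result.multiprocessor_ok:
        return "server_multiprocessor"
    return "server_uniprocessor"


full_part = bin_die(DieTestResult(True, (True, True)))
mp_defect = bin_die(DieTestResult(False, (True, True)))
cache_defect = bin_die(DieTestResult(True, (False, True)))
```

Each branch corresponds to one way of disabling the defective portion of the die so that the remainder can still ship.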
          The full cache size can be supported despite defects by adding
        redundant rows. These are extra rows of memory cells that are enabled to
        replace defective cells. When a defect is found, fuses are set on the die that
        cause cache accesses to the defective row to be directed to the redundant
        row instead. For processors where on-die cache makes up more than half