Page 194 - A Practical Guide from Design Planning to Manufacturing


        Microinstruction
          Uop: Add CX, BX, AX
          ROB entry: 2
          Phys regs: R3, R2, R1
          Source values: 33, 16
          Result: 49

        Reorder buffer                        Speculative RAT Retirement RAT  Register file
        Entry       Ready to  Arch  Physical  Arch  Physical  Arch  Physical  Entry  Value
                    retire    reg   reg       reg   reg       reg   reg
        1 (oldest)  No        AX    R1        AX    R1        AX    R8        1      16
        2           No        CX    R3        BX    R2        BX    R12       2      33
                                              CX    R3        CX    R15       3      49
        Figure 5-22 Uop at execution.

          When uop execution is complete, the result (if any) is written back into
        one of the register files along with flag values. Flag values store infor-
        mation about the result such as whether it was 0, negative, or an over-
        flow. Any of the flag values can be a condition for a later branch uop.
          At this point the add uop has finally performed its computation and
        written the result into the register file, as shown in Fig. 5-22.
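
As a rough illustration of the flag values described above, the following sketch adds two register values and derives the three conditions the text names. The 16-bit width, the function name, and the flag names are assumptions made for this example; they are not taken from the Pentium 4 design.

```python
def add_with_flags(a, b, width=16):
    """Add two unsigned register values and derive result flags.

    The register width and flag names are illustrative assumptions.
    """
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    result = (a + b) & mask
    flags = {
        "zero": result == 0,
        "negative": bool(result & sign_bit),
        # Signed overflow: both operands share a sign that the result lacks.
        "overflow": bool((a ^ result) & (b ^ result) & sign_bit),
    }
    return result, flags


# Source values from the add uop in the figure: 33 and 16.
result, flags = add_with_flags(33, 16)
print(result, flags)
# prints: 49 {'zero': False, 'negative': False, 'overflow': False}
```

A later branch uop could then test any one of these entries as its condition.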


        Retirement and drive
        Having completed execution and generated a result, the processor must
        now determine whether the correct instruction was executed with the cor-
        rect data. If the uop was a branch, its actual evaluation is compared to the
        original prediction. The front-end BTB and trace cache BTB are both
        updated with the branch’s latest behavior. If the branch was mispredicted,
        the trace cache BTB is signaled to begin reading uops from the correct
        address. All uops with entries in the ROB after the mispredicted branch
        should not have been executed at all and will have their results discarded.
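
The discarding of uops after a mispredicted branch can be sketched as a simple split of the ROB at the offending entry. This is a minimal model, assuming the ROB is a list ordered oldest first; the `RobEntry` fields and the function name are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class RobEntry:
    uop: str
    is_branch: bool = False
    predicted_taken: bool = False
    actual_taken: bool = False


def squash_after_mispredict(rob):
    """Split the ROB at the first mispredicted branch.

    rob is ordered oldest first. Returns (kept, discarded): the branch
    itself is kept, and every younger entry is discarded.
    """
    for i, entry in enumerate(rob):
        if entry.is_branch and entry.predicted_taken != entry.actual_taken:
            return rob[: i + 1], rob[i + 1:]
    return rob, []


rob = [
    RobEntry("add"),
    RobEntry("jcc", is_branch=True, predicted_taken=True, actual_taken=False),
    RobEntry("sub"),
    RobEntry("mul"),
]
kept, discarded = squash_after_mispredict(rob)
print([e.uop for e in kept], [e.uop for e in discarded])
# prints: ['add', 'jcc'] ['sub', 'mul']
```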
          Even if branch prediction was correct, the uop may have operated on
        incorrect data. If an earlier instruction had a longer latency than
        expected, then the uops that depended upon it may have been scheduled
        too soon. The ROB checks for this and, if necessary, sends the affected
        uops back to the dispatch step to be replayed.
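
The timing check behind a replay might be modelled as below. This is only a sketch: the cycle numbers are invented, and the function and parameter names are hypothetical rather than a description of the actual Pentium 4 logic.

```python
def needs_replay(consumer_dispatch, producer_dispatch, producer_actual_latency):
    """Return True if the consumer was dispatched before its source was ready."""
    result_ready_cycle = producer_dispatch + producer_actual_latency
    return consumer_dispatch < result_ready_cycle


# Hypothetical case: a producer dispatched at cycle 10 was expected to take
# 2 cycles, so its dependent uop was scheduled for cycle 12.
print(needs_replay(12, 10, 2))  # latency as expected; prints: False
print(needs_replay(12, 10, 7))  # latency longer than expected; prints: True
```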
          If branch prediction was correct and no special events occurred, then
        the execution was successful and the uop is retired. Upon retirement, the
        uop’s results are committed to the current correct architectural state by
        updating the retirement RAT and all the resources allocated to the
        instruction are released. The Pentium 4 can retire 3 uops per cycle.
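
Retirement can be sketched as a loop over the oldest ROB entries. The model below assumes entries retire strictly oldest first and that retirement stops at the first entry that is not ready; the dictionary layout and the function name are hypothetical. The register names are taken from the figures.

```python
from collections import deque

RETIRE_WIDTH = 3  # the text states that the Pentium 4 retires 3 uops per cycle


def retire_cycle(rob, retirement_rat):
    """Retire ready uops from the head of the ROB, oldest first.

    rob: deque of dicts with keys 'ready', 'arch', 'phys' (illustrative layout).
    Updates retirement_rat in place and returns the retired entries.
    """
    retired = []
    while rob and len(retired) < RETIRE_WIDTH and rob[0]["ready"]:
        entry = rob.popleft()  # releases the ROB entry
        retirement_rat[entry["arch"]] = entry["phys"]  # commit the mapping
        retired.append(entry)
    return retired


rob = deque([
    {"ready": True, "arch": "AX", "phys": "R1"},
    {"ready": True, "arch": "CX", "phys": "R3"},
])
retirement_rat = {"AX": "R8", "BX": "R12", "CX": "R15"}
retire_cycle(rob, retirement_rat)
print(retirement_rat)
# prints: {'AX': 'R1', 'BX': 'R12', 'CX': 'R3'}
```

After both uops retire, the retirement RAT maps AX to R1 and CX to R3, while BX keeps its earlier mapping to R12.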
          The add instruction can now retire with satisfaction at a job well
        done (Fig. 5-23).

        Microinstruction
          Uop: Add CX, BX, AX
          ROB entry: 2
          Phys regs: R3, R2, R1
          Source values: 33, 16
          Result: 49

        Reorder buffer                        Speculative RAT Retirement RAT  Register file
        Entry       Ready to  Arch  Physical  Arch  Physical  Arch  Physical  Entry  Value
                    retire    reg   reg       reg   reg       reg   reg
        1           Yes       AX    R1        AX    R1        AX    R1        1      16
        2 (oldest)  Yes       CX    R3        BX    R2        BX    R12       2      33
                                              CX    R3        CX    R3        3      49
        Figure 5-23 Uop at retirement.