Page 194 - A Practical Guide from Design Planning to Manufacturing
Microinstruction
  Uop: Add CX, BX, AX
  ROB entry: 2
  Phys regs: R3, R2, R1
  Source values: 33, 16
  Result: 49

Reorder buffer
  Entry       Ready to retire  Arch reg  Physical reg
  1 (oldest)  No               AX        R1
  2           No               CX        R3

Speculative RAT
  Arch reg  Physical reg
  AX        R1
  BX        R2
  CX        R3

Retirement RAT
  Arch reg  Physical reg
  AX        R8
  BX        R12
  CX        R15

Register file
  Entry  Value
  1      16
  2      33
  3      49

Figure 5-22 Uop at execution.
When uop execution is complete, the result (if any) is written back into
one of the register files along with flag values. Flag values store
information about the result, such as whether it was zero, was negative,
or caused an overflow. Any of the flag values can be a condition for a
later branch uop.
At this point the add uop has finally performed its computation and
written the result into the register file, as shown in Fig. 5-22.
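The flag values described above can be illustrated with a minimal sketch. It assumes a 16-bit register width and uses hypothetical flag names (zero, negative, overflow); it is not a model of the processor's actual flag logic.

```python
def add_with_flags(a, b, width=16):
    """Add two unsigned register values and derive simple flag bits.

    The width and the flag names are assumptions made for illustration.
    """
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    result = (a + b) & mask
    flags = {
        "zero": result == 0,
        "negative": bool(result & sign_bit),
        # Signed overflow: both operands share a sign that the result lacks.
        "overflow": bool(~(a ^ b) & (a ^ result) & sign_bit),
    }
    return result, flags


# Source values from Fig. 5-22: 33 + 16.
result, flags = add_with_flags(33, 16)
print(result)  # 49
print(flags)   # {'zero': False, 'negative': False, 'overflow': False}
```

A later branch uop could then test any one of these bits as its condition.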
Retirement and drive
Once the uop has completed execution and generated a result, the processor
must determine whether the correct instruction was executed with the correct
data. If the uop was a branch, its actual evaluation is compared to the
original prediction. The front-end BTB and trace cache BTB are both
updated with the branch’s latest behavior. If the branch was mispredicted,
the trace cache BTB is signaled to begin reading uops from the correct
address. All uops with entries in the ROB after the mispredicted branch
should not have been executed at all and will have their results discarded.
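As a rough sketch of the misprediction check, the ROB can be treated as a list ordered from oldest to youngest uop. The class, field, and uop names below are hypothetical, and BTB updates and refetching are left out.

```python
from dataclasses import dataclass


@dataclass
class RobEntry:
    uop: str
    is_branch: bool = False
    predicted_taken: bool = False
    actual_taken: bool = False


def resolve_branch(rob, index):
    """Compare the branch at rob[index] against its prediction.

    Returns (surviving_entries, mispredicted). On a misprediction, every
    entry younger than the branch is dropped, because those uops came from
    the wrong path and their results must be discarded.
    """
    branch = rob[index]
    mispredicted = (
        branch.is_branch and branch.predicted_taken != branch.actual_taken
    )
    if mispredicted:
        return rob[: index + 1], True
    return list(rob), False


rob = [
    RobEntry("load"),
    RobEntry("branch", is_branch=True, predicted_taken=True, actual_taken=False),
    RobEntry("add"),
    RobEntry("store"),
]
kept, mispredicted = resolve_branch(rob, 1)
print(mispredicted)             # True
print([e.uop for e in kept])    # ['load', 'branch']
```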
Even if branch prediction was correct, the uop may have operated on
incorrect data. If an earlier instruction had a longer than expected latency,
then the uops that depended upon it may have been scheduled too soon.
The ROB checks for this, and if necessary the uops are sent back to the
dispatch step to be replayed.
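The replay condition can be sketched as a simple timing comparison. The function name and all cycle counts are hypothetical, chosen only to show a dependent uop scheduled against an expected latency that turns out to be too short.

```python
def needs_replay(producer_dispatch, actual_latency, consumer_dispatch):
    """Return True if the consumer was dispatched before its source was ready."""
    data_ready_cycle = producer_dispatch + actual_latency
    return consumer_dispatch < data_ready_cycle


# Hypothetical numbers: a producer dispatched at cycle 10 is expected to take
# 2 cycles, so the scheduler dispatches its dependent uop at cycle 12.
EXPECTED_LATENCY = 2
consumer_dispatch = 10 + EXPECTED_LATENCY

print(needs_replay(10, 2, consumer_dispatch))  # False: latency was as expected
print(needs_replay(10, 7, consumer_dispatch))  # True: producer took longer
```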
If branch prediction was correct and no special events occurred, then
the execution was successful and the uop is retired. Upon retirement, the
uop’s results are committed to the current correct architectural state by
updating the retirement RAT and all the resources allocated to the
instruction are released. The Pentium 4 can retire 3 uops per cycle.
The add instruction can now retire with satisfaction at a job well
done (Fig. 5-23).
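Retirement can be sketched as follows, using the register mappings from the figures. The sketch assumes that uops retire in program order starting from the oldest ROB entry, and it models releasing resources simply by removing the entry. The data layout and function name are hypothetical.

```python
RETIRE_WIDTH = 3  # the text states the Pentium 4 can retire 3 uops per cycle


def retire_cycle(rob, retirement_rat):
    """Retire ready uops, oldest first, stopping at the first unready one.

    rob: list of dicts (oldest first) with keys 'ready', 'arch', 'phys'.
    Returns the number of uops retired in this cycle.
    """
    retired = 0
    while rob and retired < RETIRE_WIDTH and rob[0]["ready"]:
        entry = rob.pop(0)  # release the ROB entry
        # Commit: the architectural register now maps to this physical one.
        retirement_rat[entry["arch"]] = entry["phys"]
        retired += 1
    return retired


rob = [
    {"ready": True, "arch": "AX", "phys": "R1"},
    {"ready": True, "arch": "CX", "phys": "R3"},  # the add uop
]
retirement_rat = {"AX": "R8", "BX": "R12", "CX": "R15"}

print(retire_cycle(rob, retirement_rat))  # 2
print(retirement_rat)  # {'AX': 'R1', 'BX': 'R12', 'CX': 'R3'}
```

The resulting retirement RAT matches the mappings shown for the uop at retirement: AX to R1 and CX to R3, with BX left at R12.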
Microinstruction
  Uop: Add CX, BX, AX
  ROB entry: 2
  Phys regs: R3, R2, R1
  Source values: 33, 16
  Result: 49

Reorder buffer
  Entry       Ready to retire  Arch reg  Physical reg
  1           Yes              AX        R1
  2 (oldest)  Yes              CX        R3

Speculative RAT
  Arch reg  Physical reg
  AX        R1
  BX        R2
  CX        R3

Retirement RAT
  Arch reg  Physical reg
  AX        R1
  BX        R12
  CX        R3

Register file
  Entry  Value
  1      16
  2      33
  3      49

Figure 5-23 Uop at retirement.