          Figure 5-12 shows two processors, each with their own cache, at three
        different moments in time. Normally there would also be a Northbridge
        chip handling communication with memory but for simplicity this has
        been left out. Each cache line has a flag showing that line’s MESI state,
        a tag holding the address of the data stored, and the data itself. To start
        out, processor A’s cache is empty with all lines invalid, and processor B’s
        cache exclusively owns the line from address 1. When processor A reads
        the lines from address 1 and 2, processor B snoops the bus and sees the
        request. Processor B ignores the request for line 2, which is not in its
        cache, but it does have line 1. That cache line's state must be updated to
        shared, because now processor A's cache will hold a copy as well. When
        processor A writes both cache lines, it writes line 2 without a bus
        transaction. Because it is the exclusive owner, it does not need to
        communicate that this write has happened. However, line 1 is shared,
        which means processor A must signal that this line has been written.
        Processor B snoops this write transaction and marks its own copy invalid.
        Processor A updates its copy to modified.
          Only through this careful bookkeeping and communication can caches
        be safely used to improve performance without causing logical errors.
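          The exchange above can be sketched in code as a toy model. The following C
        sketch follows the scenario in the text (processor A reading, then writing,
        lines that processor B may hold); the struct, function names, and single-line
        view of each cache are illustrative assumptions, not a real coherence
        implementation.

/* Minimal sketch of the MESI transitions described above: two caches,
 * each modeled as individual lines with a state and a tag. Names are
 * illustrative only. */
#include <stdio.h>

typedef enum { INVALID, SHARED, EXCLUSIVE, MODIFIED } MesiState;

typedef struct {
    MesiState state;
    int       tag;    /* address of the data held in this line */
} CacheLine;

/* A read miss by one processor: the other cache snoops the request and
 * downgrades a matching line to Shared. */
static MesiState snoop_read(CacheLine *other, int addr) {
    if (other->state != INVALID && other->tag == addr) {
        other->state = SHARED;
        return SHARED;        /* requester also installs the line as Shared */
    }
    return EXCLUSIVE;         /* no other copy: requester owns it exclusively */
}

/* A write: an Exclusive line is written silently; a Shared line must
 * broadcast the write so other copies are invalidated. */
static void write_line(CacheLine *self, CacheLine *other) {
    if (self->state == SHARED && other->tag == self->tag)
        other->state = INVALID;   /* snooped invalidation */
    self->state = MODIFIED;
}

int main(void) {
    CacheLine b1 = { EXCLUSIVE, 1 };           /* B exclusively owns line 1 */
    CacheLine b2 = { INVALID,   0 };
    CacheLine a1 = { INVALID,   0 }, a2 = { INVALID, 0 };

    /* Processor A reads addresses 1 and 2 */
    a1.tag = 1; a1.state = snoop_read(&b1, 1); /* both copies become Shared */
    a2.tag = 2; a2.state = snoop_read(&b2, 2); /* A gets line 2 Exclusive   */

    /* Processor A writes both lines */
    write_line(&a2, &b2);                      /* silent write, A2 -> Modified */
    write_line(&a1, &b1);                      /* B's copy of line 1 invalidated */

    printf("A1=%d A2=%d B1=%d\n", a1.state, a2.state, b1.state);
    return 0;
}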


        Branch prediction
        One type of specialized cache used in modern microarchitectures is a
        branch prediction cache. Branches create a special problem for pipelined
        and out-of-order processors. Because they can alter the control flow, all
        the instructions after them depend upon their result. This control
        dependency affects not just the execution of later instructions, but
        whether they should be fetched at all. For many programs, 20 percent
        or more of the instructions are branches.⁶ No pipelined processor could
        hope to achieve any reasonable speedup without some mechanism for
        dealing with branch control dependencies. The most common method is
        branch prediction (Fig. 5-13). The processor simply guesses which
        instruction should be fetched next.




          Figure 5-13 Branch prediction: with no prediction the pipeline waits for
        the branch to resolve before fetching Instr 1; with a correct prediction
        Instr 1 follows the branch immediately; with an incorrect prediction the
        wrongly fetched Instr 50 must be cleared before Instr 1 is fetched.
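          One common form of branch prediction cache is a table of 2-bit saturating
        counters indexed by the branch address. The sketch below is a minimal
        illustration of that scheme; the table size, indexing, and function names
        are assumptions for the example and are not taken from the text.

/* Minimal sketch of a 2-bit saturating-counter branch predictor. */
#include <stdint.h>
#include <stdbool.h>

#define BP_ENTRIES 1024               /* entries in the prediction table */

static uint8_t counters[BP_ENTRIES];  /* 0..3: strongly/weakly not-taken..taken */

/* Hash the branch address into the table; a real predictor might also
 * keep tags or global branch history. */
static unsigned bp_index(uint32_t pc) { return (pc >> 2) % BP_ENTRIES; }

/* Predict: counter values 2 and 3 mean "taken", 0 and 1 mean "not taken". */
bool bp_predict(uint32_t pc) {
    return counters[bp_index(pc)] >= 2;
}

/* Update the counter with the branch's actual outcome once it resolves.
 * On a misprediction, the instructions fetched after the branch must be
 * cleared from the pipeline, as in Fig. 5-13. */
void bp_update(uint32_t pc, bool taken) {
    uint8_t *c = &counters[bp_index(pc)];
    if (taken) { if (*c < 3) (*c)++; }
    else       { if (*c > 0) (*c)--; }
}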

          ⁶ Hennessy and Patterson, Computer Architecture, 109.