which instruction reordering or better sharing of resources might eliminate. When dependencies can often, but not always, be eliminated, modern microarchitectures are designed to "guess" by predicting the most common behavior, spending extra time to recover the correct result after a wrong guess. Some of the most important microarchitectural concepts of recent processors are:
Cache memories
Cache coherency
Branch prediction
Register renaming
Microinstructions
Reorder, replay, retire
All of these ideas seek to improve IPC and are discussed in more
detail in the following sections.
Cache memory
Storing recently used values in cache memories to reduce average latency is an idea used over and over again in modern microarchitectures. Instructions, data, virtual memory translations, and branch addresses are all values commonly stored in caches in modern processors. Chapter 2 discussed how multiple levels of cache work together to create a memory hierarchy that has lower average latency than any single level of cache could achieve. Caches are effective at reducing average latency because programs tend to access data and instructions in regular patterns.
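To make the benefit of a multilevel hierarchy concrete, the short sketch below computes an average memory access time (AMAT) for a two-level cache in front of main memory, using the standard formula AMAT = L1 hit time + L1 miss rate x (L2 hit time + L2 miss rate x memory latency). This is an illustration only, not an example from the book, and the cycle counts and miss rates are made-up values chosen simply to show the arithmetic.

    /* Hypothetical two-level hierarchy; all numbers are assumed for illustration. */
    #include <stdio.h>

    int main(void) {
        double l1_hit_time  = 1.0;    /* cycles to hit in L1 (assumed)        */
        double l1_miss_rate = 0.05;   /* 5% of accesses miss L1 (assumed)     */
        double l2_hit_time  = 10.0;   /* cycles to hit in L2 (assumed)        */
        double l2_miss_rate = 0.20;   /* 20% of L2 accesses miss (assumed)    */
        double mem_latency  = 200.0;  /* cycles to reach main memory (assumed)*/

        double amat = l1_hit_time +
                      l1_miss_rate * (l2_hit_time + l2_miss_rate * mem_latency);

        /* 1 + 0.05 * (10 + 0.2 * 200) = 3.5 cycles, far below the
           200-cycle main-memory latency. */
        printf("Average memory access time: %.2f cycles\n", amat);
        return 0;
    }

Even modest hit rates at each level pull the average latency close to that of the fastest level, which is why adding levels to the hierarchy pays off.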
A program accessing a particular memory location is very likely to
access the same memory location again in the near future. On any given
day we are much more likely to look in a drawer that we also opened
the day before than to look in a drawer that has been closed for months.
If we used a particular knife to make lunch, we are far more likely to
use the same knife to make dinner than some utensil in a drawer that
hasn’t been opened for months. Computers act in the same way, being
more likely to access memory locations recently accessed. This is called
temporal locality; similar accesses tend to be close together in time.
By holding recently accessed values in caches, processors provide reduced latency when the same value is needed repeatedly. In addition, nearby values are likely to be needed in the near future. A typical program accessing memory location 100 is extremely likely to need location 101 as well. This grouping of accesses in similar locations is called spatial locality.
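To see how temporal and spatial locality translate into cache hits, the sketch below models a tiny direct-mapped cache and counts hits and misses for two access patterns: one that walks a small region of memory twice (high locality) and one that strides through memory without ever reusing a cached line. The cache geometry, the access patterns, and the names used here are assumptions made only for illustration; real processor caches are far more sophisticated.

    /* Hypothetical direct-mapped cache model; geometry chosen for illustration. */
    #include <stdio.h>
    #include <stdbool.h>

    #define NUM_LINES 8    /* assumed: 8 cache lines           */
    #define LINE_SIZE 64   /* assumed: 64 bytes per cache line */

    typedef struct { bool valid; unsigned long tag; } CacheLine;

    static CacheLine cache[NUM_LINES];
    static long hits, misses;

    static void access_byte(unsigned long addr) {
        unsigned long block = addr / LINE_SIZE;   /* which memory block      */
        unsigned long index = block % NUM_LINES;  /* which cache line it uses*/
        unsigned long tag   = block / NUM_LINES;  /* identifies the block    */
        if (cache[index].valid && cache[index].tag == tag) {
            hits++;                     /* value already present in the cache */
        } else {
            misses++;                   /* fetch the line, replacing old data */
            cache[index].valid = true;
            cache[index].tag = tag;
        }
    }

    int main(void) {
        /* High locality: read 256 consecutive bytes, then read them again.
           Spatial locality makes most of the first pass hit; temporal
           locality makes all of the second pass hit. */
        for (int pass = 0; pass < 2; pass++)
            for (unsigned long addr = 0; addr < 256; addr++)
                access_byte(addr);
        printf("Local pattern:   %ld hits, %ld misses\n", hits, misses);

        hits = misses = 0;
        for (int i = 0; i < NUM_LINES; i++) cache[i].valid = false;

        /* No locality: every access touches a new block that maps to the
           same cache line, so nothing cached is ever reused. */
        for (unsigned long addr = 0;
             addr < 512UL * LINE_SIZE * NUM_LINES;
             addr += LINE_SIZE * NUM_LINES)
            access_byte(addr);
        printf("Strided pattern: %ld hits, %ld misses\n", hits, misses);
        return 0;
    }

With the assumed geometry, the local pattern reports 508 hits and 4 misses, while the strided pattern reports 0 hits and 512 misses, even though both perform the same number of accesses. The cache helps only when the program exhibits the locality described above.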