Page 171 - A Practical Guide from Design Planning to Manufacturing
If I read a page from the middle of a book, I probably will read the next
page as well. If I use a tool from my toolbox, it is likely that I will also
need another tool from the same box. If program behavior were random,
caches could provide no performance improvement, but together
temporal and spatial locality make it possible for many different types
of caches to achieve very high hit rates. Of course, any cache will have
some misses, where the needed data is not found in the cache, and
different programs will exhibit varying amounts of locality, giving
different miss rates. Looking at the different causes of cache misses can
help in finding ways of improving performance.
Types of cache misses are as follows:
Capacity miss. Cache is not large enough to store all the needed values.
Conflict miss. Needed value was replaced during another recent miss.
Cold miss. Memory location is being accessed for the first time.
A capacity miss occurs when the cache is not large enough to hold all
the needed values. A program that serially accesses 1 MB of memory will
miss when starting the same sweep over again if run on a processor with
a cache smaller than 1 MB, because the data from the start of the sweep
has already been replaced by the time the sweep ends. The performance of
some programs varies dramatically when run on processors with caches
slightly larger or slightly smaller than the block of data the program is
working on.
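The sensitivity to cache size described above can be illustrated with a small simulation. The sketch below is not from the text: it assumes a fully associative cache with least-recently-used replacement, counts in cache lines rather than bytes, and uses a hypothetical working set of 1,024 lines. The name lru_hit_rate is an illustration only.

```python
from collections import OrderedDict


def lru_hit_rate(capacity_lines, working_set_lines, sweeps):
    """Hit rate of a fully associative LRU cache during repeated serial sweeps."""
    cache = OrderedDict()  # keys are line numbers, ordered oldest to newest
    hits = 0
    accesses = 0
    for _ in range(sweeps):
        for line in range(working_set_lines):
            accesses += 1
            if line in cache:
                hits += 1
                cache.move_to_end(line)        # mark as most recently used
            else:
                if len(cache) >= capacity_lines:
                    cache.popitem(last=False)  # evict the least recently used line
                cache[line] = True
    return hits / accesses


WORKING_SET = 1024  # lines touched per sweep (assumed value)
slightly_small = lru_hit_rate(WORKING_SET - 1, WORKING_SET, sweeps=10)
exact_fit = lru_hit_rate(WORKING_SET, WORKING_SET, sweeps=10)
print(slightly_small, exact_fit)
```

Under these assumptions the cache that is one line too small never hits, because each line is evicted just before the sweep returns to it, while the cache that exactly fits the working set misses only during the first sweep.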
Conflict misses occur when there is sufficient capacity to store all the
needed values, but the cache has made a poor choice of which values
to replace as new values are loaded. If the cache design chooses
the same cache location to store two different needed values, these
values may conflict with one another and cause a high miss rate even
if the total amount of data being accessed is small.
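A conflict of this kind can be sketched in a few lines. The example assumes a design in which each block of memory can be stored in exactly one cache location (commonly called a direct-mapped cache), with a hypothetical 256 lines of 64 bytes each. These parameters and the function name are assumptions made for illustration, not values from the text.

```python
def direct_mapped_misses(addresses, num_lines, line_size):
    """Count misses in a direct-mapped cache for a sequence of byte addresses."""
    tags = [None] * num_lines  # one stored tag per cache line, None when empty
    misses = 0
    for addr in addresses:
        block = addr // line_size   # memory block containing the address
        index = block % num_lines   # the only line this block may occupy
        tag = block // num_lines    # distinguishes blocks that share an index
        if tags[index] != tag:
            misses += 1
            tags[index] = tag       # replace whatever was stored there
    return misses


NUM_LINES, LINE_SIZE = 256, 64       # assumed 16 KB cache
stride = NUM_LINES * LINE_SIZE       # addresses this far apart share an index
conflicting = [0, stride] * 100      # two values competing for one line
friendly = [0, LINE_SIZE] * 100      # two values stored in different lines

conflict_misses = direct_mapped_misses(conflicting, NUM_LINES, LINE_SIZE)
friendly_misses = direct_mapped_misses(friendly, NUM_LINES, LINE_SIZE)
print(conflict_misses, friendly_misses)
```

Both access patterns touch only two values, yet in this model the conflicting pattern misses on all 200 accesses while the friendly pattern misses only on the first access to each value.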
A cache is said to be “cold” when a new program first begins to run.
Because the program has just started, the cache has not yet built up a
store of values likely to be needed. Hit rates will be very low for the first
accesses.
Each of these types of misses is affected by the choice of cache
size, associativity, and line size.
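The three categories can be separated in a simulation. The sketch below uses one common convention, which is an assumption here rather than a method given in the text: a miss is cold if the block has never been accessed, a capacity miss if a fully associative least-recently-used cache of the same size would also have missed, and a conflict miss otherwise. The cache under study is assumed to be direct-mapped, and accesses are given as block numbers.

```python
from collections import OrderedDict


def classify_misses(blocks, num_lines):
    """Label each access to a direct-mapped cache as hit, cold, capacity, or conflict."""
    seen = set()                 # blocks accessed at least once
    direct = [None] * num_lines  # the cache being studied
    ideal = OrderedDict()        # fully associative LRU cache of equal capacity
    counts = {"hit": 0, "cold": 0, "capacity": 0, "conflict": 0}
    for block in blocks:
        # Update the ideal cache and remember whether it would have hit.
        ideal_hit = block in ideal
        if ideal_hit:
            ideal.move_to_end(block)
        else:
            if len(ideal) >= num_lines:
                ideal.popitem(last=False)
            ideal[block] = True

        index = block % num_lines
        if direct[index] == block:
            counts["hit"] += 1
        elif block not in seen:
            counts["cold"] += 1      # first access to this block
        elif not ideal_hit:
            counts["capacity"] += 1  # even the ideal cache had lost it
        else:
            counts["conflict"] += 1  # only the placement rule caused the miss
        direct[index] = block
        seen.add(block)
    return counts


ping_pong = classify_misses([0, 4, 0, 4], num_lines=4)
big_sweep = classify_misses(list(range(8)) * 2, num_lines=4)
print(ping_pong)
print(big_sweep)
```

In the first trace, blocks 0 and 4 share a location in a four-line cache, so the repeat accesses count as conflict misses. In the second trace, eight blocks are swept through a four-line cache twice, so every access in the second sweep counts as a capacity miss.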
The cache size is simply the amount of data it can store. Increasing
cache size is the easiest way to improve hit rates by reducing capacity
misses. This performance improvement comes at the cost of larger die size.
A large cache will also increase the cache delay, because the more memory
cells in a cache, the longer it takes to access. This means increasing the
cache size will usually also require increasing the number of clock cycles
allocated for a cache access or reducing the processor clock frequency.
Either of these will cause a loss in performance that may easily offset the
improvement gained by a better hit rate. Dividing a large cache into
multiple levels allows a microarchitecture to balance size and latency to
provide the best overall performance.
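The tradeoff between size and latency can be made concrete with the standard average access time calculation, in which the cost of an access is the hit time plus the miss rate multiplied by the cost of going to the next level. All of the cycle counts and miss rates below are hypothetical numbers chosen for illustration. They are not measurements or figures from the text.

```python
def average_access_cycles(levels, memory_cycles):
    """Average cycles per access for a cache hierarchy.

    levels is a list of (hit_cycles, local_miss_rate) pairs, ordered from the
    cache nearest the processor to the one nearest main memory.
    """
    cost = memory_cycles
    # Work outward from memory: each level adds its hit time and pays the
    # cost of the level below it only on a miss.
    for hit_cycles, miss_rate in reversed(levels):
        cost = hit_cycles + miss_rate * cost
    return cost


MEMORY_CYCLES = 200  # assumed main memory latency

small_fast = average_access_cycles([(2, 0.10)], MEMORY_CYCLES)
large_slow = average_access_cycles([(8, 0.02)], MEMORY_CYCLES)
# Two levels: the small cache backed by the large one. A local miss rate of
# 0.20 at the second level gives the same overall miss rate (0.02) as the
# large cache alone.
two_level = average_access_cycles([(2, 0.10), (8, 0.20)], MEMORY_CYCLES)
print(small_fast, large_slow, two_level)
```

With these assumed numbers the small cache averages about 22 cycles per access, the large cache about 12, and the two-level arrangement about 6.8, because most accesses are served at the speed of the small cache while few go all the way to memory.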