Replacement Algorithms
Once the cache has been filled, when a new block is brought into the cache, one of the existing blocks must be replaced. For direct mapping, there is only one possible line for any particular block, and no choice is possible. For the associative and set-associative techniques, a replacement algorithm is needed. To achieve high speed, such an algorithm must be implemented in hardware. A number of algorithms have been tried. We mention four of the most common. Probably the most effective is least recently used (LRU): Replace that block in the set that has been in the cache longest with no reference to it. For two-way set associative, this is easily implemented. Each line includes a USE bit. When a line is referenced, its USE bit is set to 1 and the USE bit of the other line in that set is set to 0. When a block is to be read into the set, the line whose USE bit is 0 is used. Because we are assuming that more recently used memory locations are more likely to be referenced, LRU should give the best hit ratio. LRU is also relatively easy to implement for a fully associative cache. The cache mechanism maintains a separate list of indexes to all the lines in the cache. When a line is referenced, it moves to the front of the list. For replacement, the line at the back of the list is used. Because of its simplicity of implementation, LRU is the most popular replacement algorithm.
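The USE-bit scheme for a two-way set-associative cache can be sketched in a few lines of C. The sketch below is illustrative only; the names CacheLine, NUM_SETS, touch, and lru_victim, and the set count, are assumptions rather than features of any particular processor.

    /* Minimal sketch of the USE-bit LRU scheme for a two-way
       set-associative cache. All names and sizes are illustrative. */
    #include <stdbool.h>
    #include <stdint.h>

    #define NUM_SETS 128              /* assumed number of sets */

    typedef struct {
        bool     valid;
        bool     use;                 /* 1 = more recently used of the pair */
        uint32_t tag;
    } CacheLine;

    static CacheLine cache[NUM_SETS][2];

    /* On a reference, mark the hit line as used and the other as not. */
    static void touch(int set, int way)
    {
        cache[set][way].use     = true;
        cache[set][way ^ 1].use = false;
    }

    /* On a miss, replace the line whose USE bit is 0
       (the less recently used of the two). */
    static int lru_victim(int set)
    {
        return cache[set][0].use ? 1 : 0;
    }

Note that a single bit per line suffices only because each set holds exactly two lines; for higher associativity, LRU requires more state per set.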
Another possibility is first-in-first-out (FIFO): Replace that block in the set that has been in the cache longest. FIFO is easily implemented as a round-robin or circular buffer technique. Still another possibility is least frequently used (LFU): Replace that block in the set that has experienced the fewest references. LFU could be implemented by associating a counter with each line. A technique not based on usage (i.e., not LRU, LFU, FIFO, or some variant) is to pick a line at random from among the candidate lines. Simulation studies have shown that random replacement provides performance only slightly inferior to that of an algorithm based on usage [SMIT82].
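For comparison, the other three policies admit similarly small sketches. The following assumes a hypothetical four-way set-associative organization; NUM_SETS, WAYS, next_victim, and ref_count are illustrative names, not drawn from the text.

    /* Sketches of FIFO, LFU, and random victim selection for an
       assumed four-way set-associative cache. */
    #include <stdint.h>
    #include <stdlib.h>

    #define NUM_SETS 128
    #define WAYS     4

    static uint8_t  next_victim[NUM_SETS];      /* FIFO: round-robin pointer */
    static uint32_t ref_count[NUM_SETS][WAYS];  /* LFU: per-line counters    */

    /* FIFO: lines are evicted in the order they were filled. */
    static int fifo_victim(int set)
    {
        int way = next_victim[set];
        next_victim[set] = (uint8_t)((way + 1) % WAYS);
        return way;
    }

    /* LFU: evict the line referenced least often since it was filled. */
    static int lfu_victim(int set)
    {
        int victim = 0;
        for (int w = 1; w < WAYS; w++)
            if (ref_count[set][w] < ref_count[set][victim])
                victim = w;
        return victim;
    }

    /* Random: pick any candidate line; per [SMIT82], only slightly
       worse than the usage-based policies above. */
    static int random_victim(void)
    {
        return rand() % WAYS;
    }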
Write Policy
When a block that is resident in the cache is to be replaced, there are two cases to consider. If the old block in the cache has not been altered, then it may be overwritten with a new block without first writing out the old block. If at least one write operation has been performed on a word in that line of the cache, then main memory must be updated by writing the line of cache out to the block of memory before bringing in the new block. A variety of write policies, with performance and economic trade-offs, is possible. There are two problems to contend with. First, more than one device may have access to main memory. For example, an I/O module may be able to read and write directly to memory. If a word has been altered only in the cache, then the corresponding memory word is invalid. Further, if the I/O device has altered main memory, then the cache word is invalid. A more complex problem occurs when multiple processors are attached to the same bus and each processor has its own local cache. Then, if a word is altered in one cache, it could conceivably invalidate a word in other caches.
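Returning to the replacement-time check at the start of this subsection: it is commonly implemented with a per-line "dirty" bit that records whether any word in the line has been written, so the line is copied back to memory only when necessary. The sketch below is a rough illustration; the Line structure, block size, and memory_write_block helper are all assumptions.

    /* Sketch of the replacement-time check: flush the old line to
       memory only if it was altered while cached. Names and the
       16-word block size are illustrative. */
    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        bool     valid;
        bool     dirty;               /* set on any write to the line */
        uint32_t tag;
        uint32_t data[16];            /* assumed 16-word block        */
    } Line;

    /* Assumed helper that writes one block back to main memory. */
    extern void memory_write_block(uint32_t block_addr, const uint32_t *data);

    static void replace_line(Line *line, uint32_t old_block_addr,
                             uint32_t new_tag, const uint32_t *new_block)
    {
        if (line->valid && line->dirty)
            memory_write_block(old_block_addr, line->data); /* flush first */
        for (int i = 0; i < 16; i++)
            line->data[i] = new_block[i];
        line->tag   = new_tag;
        line->valid = true;
        line->dirty = false;          /* the new block starts out clean */
    }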
The simplest technique is called write through. Using this technique, all write operations are made to main memory as well as to the cache, ensuring that main memory is always valid. Any other processor–cache module can monitor traffic to main memory to maintain consistency within its own cache. The main disadvantage of this technique is that it generates substantial memory traffic and may create a bottleneck.
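A write-through store can be summarized in a short C sketch. The helpers cache_lookup, cache_write_word, and memory_write_word are assumed names standing in for the cache-lookup and memory-access logic, not part of the text's description.

    /* Sketch of a write-through store: on a hit the cached copy is
       updated, and the word is always written to main memory as well,
       so memory never holds stale data. Helper names are assumptions. */
    #include <stdbool.h>
    #include <stdint.h>

    extern bool cache_lookup(uint32_t addr, int *set, int *way);
    extern void cache_write_word(int set, int way,
                                 uint32_t addr, uint32_t data);
    extern void memory_write_word(uint32_t addr, uint32_t data);

    void store_word(uint32_t addr, uint32_t data)
    {
        int set, way;
        if (cache_lookup(addr, &set, &way))          /* write hit        */
            cache_write_word(set, way, addr, data);
        memory_write_word(addr, data);               /* always to memory */
    }

Because every store reaches main memory, the design trades memory-bus traffic for the guarantee that memory is always valid, which is what lets other processor–cache modules maintain consistency simply by watching the bus.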

