
136  CHAPTER 4 / CACHE MEMORY

                  Replacement Algorithms
                  Once the cache has been filled, when a new block is brought into the cache, one of
                  the existing blocks must be replaced. For direct mapping, there is only one possible
                  line for any particular block, and no choice is possible. For the associative and
                  set-associative techniques, a replacement algorithm is needed. To achieve high speed,
                  such an algorithm must be implemented in hardware. A number of algorithms have
                  been tried. We mention four of the most common. Probably the most effective is
                  least recently used (LRU): Replace that block in the set that has been in the cache
                  longest with no reference to it. For two-way set associative, this is easily
                  implemented. Each line includes a USE bit. When a line is referenced, its USE bit is set to
                  1 and the USE bit of the other line in that set is set to 0. When a block is to be read
                  into the set, the line whose USE bit is 0 is used. Because we are assuming that more
                  recently used memory locations are more likely to be referenced, LRU should give
                  the best hit ratio. LRU is also relatively easy to implement for a fully associative
                  cache. The cache mechanism maintains a separate list of indexes to all the lines in
                  the cache. When a line is referenced, it moves to the front of the list. For
                  replacement, the line at the back of the list is used. Because of its simplicity of
                  implementation, LRU is the most popular replacement algorithm.
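The USE-bit scheme for a two-way set-associative cache can be sketched in a few lines. This is an illustrative software model, not real hardware; the class and method names (`TwoWaySet`, `access`) are hypothetical.

```python
# Sketch of LRU replacement in a two-way set-associative cache using
# one USE bit per line, as described in the text. On a hit, the
# referenced line's USE bit is set to 1 and the other line's to 0;
# on a miss, the line whose USE bit is 0 is the replacement victim.

class TwoWaySet:
    def __init__(self):
        self.tags = [None, None]   # tag stored in each of the two lines
        self.use = [0, 0]          # USE bit per line

    def access(self, tag):
        """Return True on a hit; on a miss, replace the LRU line."""
        for i in (0, 1):
            if self.tags[i] == tag:
                self.use[i] = 1            # mark this line most recently used
                self.use[1 - i] = 0
                return True
        # Miss: the line whose USE bit is 0 was used less recently.
        victim = 0 if self.use[0] == 0 else 1
        self.tags[victim] = tag
        self.use[victim] = 1
        self.use[1 - victim] = 0
        return False
```

Accessing blocks A, B, A, C in sequence leaves A resident and evicts B, because A's reference cleared B's USE bit before the miss on C.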
                       Another possibility is first-in-first-out (FIFO): Replace that block in the set that
                  has been in the cache longest. FIFO is easily implemented as a round-robin or circular
                  buffer technique. Still another possibility is least frequently used (LFU): Replace that
                  block in the set that has experienced the fewest references. LFU could be
                  implemented by associating a counter with each line. A technique not based on usage (i.e.,
                  not LRU, LFU, FIFO, or some variant) is to pick a line at random from among the
                  candidate lines. Simulation studies have shown that random replacement provides
                  only slightly inferior performance to an algorithm based on usage [SMIT82].
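The FIFO and LFU policies just described can likewise be sketched as simple software models. The round-robin pointer and per-line counter mirror the text's descriptions; the class names (`FIFOSet`, `LFUSet`) are illustrative, and a real cache would implement this logic in hardware.

```python
# FIFO: a round-robin pointer cycles through the lines of a set, so the
# block that has been resident longest is always the next to be replaced.
class FIFOSet:
    def __init__(self, ways):
        self.tags = [None] * ways
        self.next = 0                      # round-robin replacement pointer

    def access(self, tag):
        if tag in self.tags:
            return True                    # hit: FIFO order is unchanged
        self.tags[self.next] = tag         # replace the oldest-filled line
        self.next = (self.next + 1) % len(self.tags)
        return False

# LFU: each line carries a reference counter; the line with the fewest
# references is the replacement victim.
class LFUSet:
    def __init__(self, ways):
        self.tags = [None] * ways
        self.count = [0] * ways            # reference counter per line

    def access(self, tag):
        if tag in self.tags:
            self.count[self.tags.index(tag)] += 1
            return True
        victim = self.count.index(min(self.count))  # fewest references
        self.tags[victim] = tag
        self.count[victim] = 1
        return False
```

Note the difference in behavior: FIFO evicts the oldest block regardless of how often it has been referenced, while LFU keeps a heavily referenced block resident even if it entered the cache long ago.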

                  Write Policy

                  When a block that is resident in the cache is to be replaced, there are two cases to
                  consider. If the old block in the cache has not been altered, then it may be
                  overwritten with a new block without first writing out the old block. If at least one write
                  operation has been performed on a word in that line of the cache, then main memory
                  must be updated by writing the line of cache out to the block of memory before
                  bringing in the new block. A variety of write policies, with performance and
                  economic trade-offs, is possible. There are two problems to contend with. First, more
                  than one device may have access to main memory. For example, an I/O module may
                  be able to read-write directly to memory. If a word has been altered only in the
                  cache, then the corresponding memory word is invalid. Further, if the I/O device has
                  altered main memory, then the cache word is invalid. A more complex problem
                  occurs when multiple processors are attached to the same bus and each processor has
                  its own local cache. Then, if a word is altered in one cache, it could conceivably
                  invalidate a word in other caches.
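The replacement rule described above, in which an altered line must be written back to main memory before it is overwritten, is commonly tracked with a per-line dirty bit. A minimal sketch, assuming a hypothetical `Line` record and `replace` helper (neither is from the text):

```python
# Sketch of the write-back rule: on replacement, a line is flushed to
# main memory only if it has been altered (its dirty bit is set).

class Line:
    def __init__(self):
        self.tag = None
        self.dirty = False                 # set when a word in the line is written

def replace(line, new_tag, memory_writes):
    """Bring a new block into 'line', writing the old block back first
    only if it was altered while resident in the cache."""
    if line.dirty:
        memory_writes.append(line.tag)     # flush the altered block to memory
    line.tag = new_tag
    line.dirty = False                     # the freshly loaded block is clean
```

A clean line is simply overwritten; only a line whose dirty bit is set costs an extra memory write on replacement.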
                       The simplest technique is called write through. Using this technique, all write
                  operations are made to main memory as well as to the cache, ensuring that main
                  memory is always valid. Any other processor–cache module can monitor traffic to
                  main memory to maintain consistency within its own cache. The main disadvantage