

                       The write buffer operates as follows: When the processor performs a write to a
                  bufferable area, the data are placed in the write buffer at processor clock speed and
                  the processor continues execution. A write occurs when data in the cache are
                  written back to main memory. Thus, the data to be written are transferred from the
                  cache to the write buffer. The write buffer then performs the external write in
                  parallel. If, however, the write buffer is full (either because there are already the
                  maximum number of words of data in the buffer or because there is no slot for the new
                  address) then the processor is stalled until there is sufficient space in the buffer. As
                  non-write operations proceed, the write buffer continues to write to main memory
                  until the buffer is completely empty.
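
The stall-when-full behavior described above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not a model of any particular processor: the class `WriteBuffer`, the function `run`, and the parameters `capacity` and `drain_cycles` are hypothetical names, and the two separate limits mentioned in the text (data words and address slots) are collapsed into a single `capacity` for brevity.

```python
from collections import deque


class WriteBuffer:
    """Toy FIFO write buffer (illustrative only; all parameters are assumed)."""

    def __init__(self, capacity=2, drain_cycles=3):
        self.capacity = capacity          # max buffered writes (single combined limit)
        self.drain_cycles = drain_cycles  # processor cycles per external write
        self.entries = deque()            # pending (address, data) pairs
        self.countdown = 0                # cycles left on the write in progress

    def full(self):
        return len(self.entries) >= self.capacity

    def tick(self):
        """Advance one cycle; the external write proceeds in parallel."""
        if not self.entries:
            return
        if self.countdown == 0:
            self.countdown = self.drain_cycles  # start the next external write
        self.countdown -= 1
        if self.countdown == 0:
            self.entries.popleft()              # write reached main memory

    def write(self, addr, data):
        """Buffer one write; return how many cycles the processor stalled."""
        stalled = 0
        while self.full():        # processor waits until a slot frees up
            self.tick()
            stalled += 1
        self.entries.append((addr, data))
        return stalled


def run(ops, buf):
    """Execute ('w', addr, data) and ('n',) operations; return total stall cycles."""
    stalls = 0
    for op in ops:
        if op[0] == "w":
            stalls += buf.write(op[1], op[2])
        buf.tick()  # one cycle for the operation itself
    return stalls


# Four back-to-back writes overflow a two-entry buffer and stall the processor.
burst = [("w", addr, 0) for addr in range(4)]
print(run(burst, WriteBuffer()))   # 3

# The same writes separated by non-write operations never stall.
spaced = [op for addr in range(4) for op in (("w", addr, 0), ("n",), ("n",), ("n",))]
print(run(spaced, WriteBuffer()))  # 0
```

With these assumed parameters, the burst of writes incurs stall cycles because the buffer fills faster than it drains, whereas the interleaved non-write operations give the buffer time to empty between writes.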
                       Data written to the write buffer are not available for reading back into the
                  cache until the data have transferred from the write buffer to main memory. This is
                  the principal reason that the write buffer is quite small. Even so, unless there is a
                  high proportion of writes in an executing program, the write buffer improves
                  performance.




             4.6 RECOMMENDED READING

                  [JACO08] is an excellent, up-to-date treatment of cache design. Another thorough
                  treatment is [HAND98]. A classic paper that is still well worth reading is [SMIT82]; it
                  surveys the various elements of cache design and presents the results of an extensive set
                  of analyses. Another interesting classic is [WILK65], which is probably the first paper to
                  introduce the concept of the cache. [GOOD83] also provides a useful analysis of cache
                  behavior. Another worthwhile analysis is [BELL74]. [AGAR89] presents a detailed
                  examination of a variety of cache design issues related to multiprogramming and
                  multiprocessing. [HIGB90] provides a set of simple formulas that can be used to estimate
                  cache performance as a function of various cache parameters.


                   AGAR89 Agarwal, A. Analysis of Cache Performance for Operating Systems and
                        Multiprogramming. Boston: Kluwer Academic Publishers, 1989.
                   BELL74 Bell, J.; Casasent, D.; and Bell, C. “An Investigation into Alternative Cache
                        Organizations.” IEEE Transactions on Computers, April 1974.
                        http://research.microsoft.com/users/GBell/gbvita.htm.
                   GOOD83 Goodman, J. “Using Cache Memory to Reduce Processor-Memory Band-
                        width.” Proceedings, 10th Annual International Symposium on Computer Architec-
                        ture, 1983. Reprinted in [HILL00].
                   HAND98 Handy, J. The Cache Memory Book. San Diego: Academic Press, 1998.
                   HIGB90 Higbie, L. “Quick and Easy Cache Performance Analysis.” Computer
                        Architecture News, June 1990.
                   JACO08 Jacob, B.; Ng, S.; and Wang, D. Memory Systems: Cache, DRAM, Disk. Boston:
                        Morgan Kaufmann, 2008.
                   SMIT82 Smith, A. “Cache Memories.” ACM Computing Surveys, September 1982.
                   WILK65 Wilkes, M. “Slave Memories and Dynamic Storage Allocation.” IEEE
                        Transactions on Electronic Computers, April 1965. Reprinted in [HILL00].