
44  CHAPTER 2 / COMPUTER EVOLUTION AND PERFORMANCE

                  by increasing clock speed. First, there has been an increase in cache capacity. There
                  are now typically two or three levels of cache between the processor and main
                  memory. As chip density has increased, more of the cache memory has been incorporated
                  on the chip, enabling faster cache access. For example, the original Pentium chip
                  devoted about 10% of on-chip area to a cache. The most recent Pentium 4 chip devotes
                  about half of the chip area to caches.
                       Second, the instruction execution logic within a processor has become
                  increasingly complex to enable parallel execution of instructions within the
                  processor. Two noteworthy design approaches have been pipelining and superscalar
                  execution. A pipeline works much as an assembly line in a manufacturing plant,
                  enabling different stages of execution of different instructions to occur at the
                  same time along the pipeline. A superscalar approach in essence allows multiple
                  pipelines within a single processor so that instructions that do not depend on one
                  another can be executed in parallel.
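
                       The assembly-line analogy can be made concrete with a small cycle count.
                  The sketch below is an idealized model, not something taken from the text: it
                  assumes every stage takes one clock cycle, that there are no stalls, and that
                  instructions are fully independent. The function names and the `width`
                  parameter (number of parallel pipelines) are hypothetical.

```python
def cycles_unpipelined(n_instructions: int, n_stages: int) -> int:
    """Each instruction passes through all stages before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions: int, n_stages: int, width: int = 1) -> int:
    """Idealized pipeline: fill once, then complete `width` instructions per cycle.

    width=1 models a single pipeline; width>1 models a superscalar design
    with `width` parallel pipelines and no dependencies between instructions.
    """
    if n_instructions == 0:
        return 0
    groups = -(-n_instructions // width)  # ceiling division
    return n_stages + groups - 1


if __name__ == "__main__":
    n, k = 100, 5
    print(cycles_unpipelined(n, k))    # 500
    print(cycles_pipelined(n, k))      # 104
    print(cycles_pipelined(n, k, 2))   # 54
```

                       Real processors fall short of these figures because dependencies and
                  branches prevent the pipelines from staying full.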
                       Both of these approaches are reaching a point of diminishing returns. The in-
                  ternal organization of contemporary processors is exceedingly complex and is able
                  to squeeze a great deal of parallelism out of the instruction stream. It seems likely
                  that further gains in this direction will be relatively modest
                  [GIBB04]. With three levels of cache on the processor chip, each level providing
                  substantial capacity, it also seems that the benefits from the cache are reaching
                  a limit.
                       However, simply relying on increasing clock rate for increased performance
                  runs into the power dissipation problem already referred to. The faster the clock
                  rate, the greater the amount of power to be dissipated, and some fundamental phys-
                  ical limits are being reached.
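
                       The link between clock rate and power can be illustrated with the
                  standard first-order model of CMOS switching power, P = a * C * V^2 * f. This
                  model is not given in the text; it is brought in here as an assumption, and the
                  function name and sample values are hypothetical.

```python
def dynamic_power(capacitance_f: float, voltage_v: float,
                  frequency_hz: float, activity: float = 1.0) -> float:
    """First-order CMOS switching power in watts: P = a * C * V^2 * f."""
    return activity * capacitance_f * voltage_v ** 2 * frequency_hz


if __name__ == "__main__":
    base = dynamic_power(1e-9, 1.2, 2e9)     # illustrative values only
    faster = dynamic_power(1e-9, 1.2, 4e9)   # same chip, doubled clock rate
    print(faster / base)                     # 2.0
```

                       In this model, power grows in direct proportion to clock rate when the
                  voltage is held fixed. Higher clock rates often require a higher supply
                  voltage as well, in which case power rises faster than the clock rate.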
                       With all of these difficulties in mind, designers have turned to a fundamentally
                  new approach to improving performance: placing multiple processors on the same
                  chip, with a large shared cache. The use of multiple processors on the same chip, also
                  referred to as multiple cores, or multicore, provides the potential to increase
                  performance without increasing the clock rate. Studies indicate that, within a processor, the
                  increase in performance is roughly proportional to the square root of the increase in
                  complexity [BORK03]. But if the software can support the effective use of multiple
                  processors, then doubling the number of processors almost doubles performance.
                  Thus, the strategy is to use two simpler processors on the chip rather than one more
                  complex processor.
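
                       The trade-off can be put in numbers. The sketch below compares doubling
                  the complexity of one processor, using the square-root rule cited above, with
                  using two processors. The multicore estimate uses Amdahl's law, which this
                  passage does not name; it is introduced here as an assumption to model
                  software that is only partly parallel. The function names are hypothetical.

```python
import math


def single_core_speedup(complexity_ratio: float) -> float:
    """Square-root rule: performance grows as sqrt(increase in complexity)."""
    return math.sqrt(complexity_ratio)


def multicore_speedup(n_cores: int, parallel_fraction: float = 1.0) -> float:
    """Amdahl's law: only `parallel_fraction` of the work scales with cores."""
    serial = 1.0 - parallel_fraction
    return 1.0 / (serial + parallel_fraction / n_cores)


if __name__ == "__main__":
    print(round(single_core_speedup(2.0), 2))     # 1.41
    print(round(multicore_speedup(2, 1.0), 2))    # 2.0
    print(round(multicore_speedup(2, 0.9), 2))    # 1.82
```

                       Under these assumptions, two simple processors outperform one processor of
                  twice the complexity even when 10% of the work remains serial.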
                       In addition, with two processors, larger caches are justified. This is important
                  because the power consumption of memory logic on a chip is much less than that of
                  processing logic. In coming years, we can expect that most new processor chips will
                  have multiple processors.


             2.3 THE EVOLUTION OF THE INTEL x86 ARCHITECTURE


                  Throughout this book, we rely on many concrete examples of computer design and
                  implementation to illustrate concepts and to illuminate trade-offs. Most of the time,
                  the book relies on examples from two computer families: the Intel x86 and the
                  ARM architecture. The current x86 offerings represent the results of decades of