by increasing clock speed. First, there has been an increase in cache capacity. There
are now typically two or three levels of cache between the processor and main memory.
As chip density has increased, more of the cache memory has been incorporated
on the chip, enabling faster cache access. For example, the original Pentium chip
devoted about 10% of on-chip area to a cache. The most recent Pentium 4 chip devotes
about half of the chip area to caches.
Second, the instruction execution logic within a processor has become increasingly
complex to enable parallel execution of instructions within the processor.
Two noteworthy design approaches have been pipelining and superscalar. A
pipeline works much as an assembly line in a manufacturing plant, enabling different
stages of execution of different instructions to occur at the same time along the
pipeline. A superscalar approach in essence allows multiple pipelines within a single
processor, so that instructions that do not depend on one another can be executed
in parallel.
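As a simple illustration (the fragment below is our own sketch, not an example from the text, and the actual issue order depends on the compiler-generated machine code and the particular processor), consider a function in which two statements have no data dependence on each other. A superscalar processor with two pipelines could issue them in the same clock cycle, while the final statement must wait for both results.

    /* Illustrative sketch of independent vs. dependent instructions.      */
    /* The computations of a and b use disjoint inputs, so a superscalar   */
    /* processor may issue them in the same cycle; the subtraction depends */
    /* on both results and is executed afterward.                          */
    int f(int x, int y, int p, int q)
    {
        int a = x * y;   /* no dependence on b */
        int b = p + q;   /* no dependence on a */
        return a - b;    /* depends on both a and b */
    }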
Both of these approaches are reaching a point of diminishing returns. The internal
organization of contemporary processors is exceedingly complex and is able
to squeeze a great deal of parallelism out of the instruction stream. It seems likely
that further gains in this direction will be relatively modest [GIBB04]. With three
levels of cache on the processor chip, each level providing substantial capacity, it
also seems that the benefits from the cache are reaching a limit.
However, simply relying on increasing clock rate for increased performance
runs into the power dissipation problem already referred to. The faster the clock
rate, the greater the amount of power to be dissipated, and some fundamental
physical limits are being reached.
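As a rough point of reference (the relation below is the standard first-order approximation for dynamic power in CMOS logic, not a formula given in this section), dynamic power dissipation grows with clock frequency approximately as

    P_dynamic ≈ α · C · V² · f

where α is the switching activity factor, C is the switched capacitance, V is the supply voltage, and f is the clock frequency. Because the supply voltage has often had to rise along with frequency, power has in practice tended to grow faster than linearly with clock rate.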
With all of these difficulties in mind, designers have turned to a fundamentally
new approach to improving performance: placing multiple processors on the same
chip, with a large shared cache. The use of multiple processors on the same chip, also
referred to as multiple cores, or multicore, provides the potential to increase performance
without increasing the clock rate. Studies indicate that, within a processor, the
increase in performance is roughly proportional to the square root of the increase in
complexity [BORK03]. But if the software can support the effective use of multiple
processors, then doubling the number of processors almost doubles performance.
Thus, the strategy is to use two simpler processors on the chip rather than one more
complex processor.
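To see the arithmetic behind this strategy, apply the square-root rule quoted above: doubling the complexity of a single processor yields roughly √2 ≈ 1.4 times its original performance, whereas two processors of the original, simpler design represent about the same total complexity yet, with software that can keep both busy, offer close to 2 times the performance.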
In addition, with two processors, larger caches are justified. This is important
because the power consumption of memory logic on a chip is much less than that of
processing logic. In coming years, we can expect that most new processor chips will
have multiple processors.
2.3 THE EVOLUTION OF THE INTEL x86 ARCHITECTURE
Throughout this book, we rely on many concrete examples of computer design and
implementation to illustrate concepts and to illuminate trade-offs. Most of the time,
the book relies on examples from two computer families: the Intel x86 and the
ARM architecture. The current x86 offerings represent the results of decades of