is more complex. The ratio is calculated as follows:

r_i = \frac{N \cdot Tref_i}{Tsut_i} \qquad (2.7)

where Tref_i is the reference execution time for benchmark i, N is the number of copies of the program that are run simultaneously, and Tsut_i is the elapsed time from the start of the execution of the program on all N processors of the system under test until the completion of all the copies of the program. Again, a geometric mean is calculated to determine the overall performance measure.
SPEC chose to use a geometric mean because it is the most appropriate for normalized numbers, such as ratios. [FLEM86] demonstrates that the geometric mean has the property that performance relationships are consistently maintained regardless of which computer is used as the basis for normalization.
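The following is a minimal sketch in Python of how the rate ratio of Equation (2.7) and the overall geometric-mean measure might be computed; the benchmark names, reference times, copy counts, and measured elapsed times are illustrative assumptions, not actual SPEC data.

import math

# Hypothetical data: for each benchmark, the reference execution time Tref_i
# (seconds), the number of copies N run simultaneously, and the elapsed time
# Tsut_i (seconds) measured on the system under test.
benchmarks = [
    {"name": "bench_a", "t_ref": 1000.0, "n_copies": 4, "t_sut": 120.0},
    {"name": "bench_b", "t_ref": 2500.0, "n_copies": 4, "t_sut": 310.0},
    {"name": "bench_c", "t_ref": 1800.0, "n_copies": 4, "t_sut": 200.0},
]

# Rate ratio for each benchmark, per Equation (2.7): r_i = (N * Tref_i) / Tsut_i
ratios = [(b["n_copies"] * b["t_ref"]) / b["t_sut"] for b in benchmarks]

# Overall rate metric: the geometric mean of the individual ratios.
geometric_mean = math.exp(sum(math.log(r) for r in ratios) / len(ratios))

for b, r in zip(benchmarks, ratios):
    print(f"{b['name']}: rate ratio = {r:.1f}")
print(f"Overall rate metric (geometric mean) = {geometric_mean:.1f}")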
Amdahl’s Law
When considering system performance, computer system designers look for ways to improve performance by improvement in technology or change in design. Examples include the use of parallel processors, the use of a memory cache hierarchy, and speedup in memory access time and I/O transfer rate due to technology improvements. In all of these cases, it is important to note that a speedup in one aspect of the technology or design does not result in a corresponding improvement in overall performance. This limitation is succinctly expressed by Amdahl's law.
Amdahl's law was first proposed by Gene Amdahl in [AMDA67] and deals with the potential speedup of a program using multiple processors compared to a single processor. Consider a program running on a single processor such that a fraction (1 - f) of the execution time involves code that is inherently serial and a fraction f involves code that is infinitely parallelizable with no scheduling overhead. Let T be the total execution time of the program using a single processor. Then the speedup using a parallel processor with N processors that fully exploits the parallel portion of the program is as follows:
\mathrm{Speedup} = \frac{\text{time to execute program on a single processor}}{\text{time to execute program on $N$ parallel processors}}
                 = \frac{T(1-f) + Tf}{T(1-f) + \dfrac{Tf}{N}} = \frac{1}{(1-f) + \dfrac{f}{N}}
Two important conclusions can be drawn:
1. When f is small, the use of parallel processors has little effect.
2. As N approaches infinity, speedup is bounded by 1/(1 - f), so that there are diminishing returns for using more processors; the sketch that follows illustrates both effects.
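A short numeric sketch, using illustrative values of f and N, evaluates the speedup formula and shows both effects at once: a small parallelizable fraction limits the benefit of adding processors, and for any fixed f the speedup approaches the ceiling 1/(1 - f).

def amdahl_speedup(f: float, n: int) -> float:
    """Speedup with n processors when a fraction f of the execution time is parallelizable."""
    return 1.0 / ((1.0 - f) + f / n)

# Illustrative values only: vary the parallelizable fraction and processor count.
for f in (0.5, 0.9, 0.99):
    ceiling = 1.0 / (1.0 - f)  # limit of the speedup as N approaches infinity
    results = ", ".join(
        f"N={n}: {amdahl_speedup(f, n):.1f}x" for n in (2, 8, 64, 1024)
    )
    print(f"f={f}: {results}  (ceiling {ceiling:.0f}x)")

For f = 0.5, even 1,024 processors yield a speedup of barely 2, while for f = 0.99 each added processor contributes less and less as the speedup approaches its 100x ceiling.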
These conclusions are too pessimistic, an assertion first put forward in
[GUST88]. For example, a server can maintain multiple threads or multiple tasks to
handle multiple clients and execute the threads or tasks in parallel up to the limit of
the number of processors. Many database applications involve computations on
massive amounts of data that can be split up into multiple parallel tasks. Nevertheless,