54 CHAPTER 2 / COMPUTER EVOLUTION AND PERFORMANCE
3. It can be measured easily.
4. It has wide distribution.
SPEC BENCHMARKS The common need in the industry, academic, and research communities for generally accepted computer performance measurements has led to the development of standardized benchmark suites. A benchmark suite is a collection of programs, defined in a high-level language, that together attempt to provide a representative test of a computer in a particular application or system programming area. The best known such collection of benchmark suites is defined and maintained by the Standard Performance Evaluation Corporation (SPEC), an industry consortium. SPEC performance measurements are widely used for comparison and research purposes.
The best known of the SPEC benchmark suites is SPEC CPU2006. This is the industry-standard suite for processor-intensive applications. That is, SPEC CPU2006 is appropriate for measuring performance for applications that spend most of their time doing computation rather than I/O. The CPU2006 suite is based on existing applications that have already been ported to a wide variety of platforms by SPEC industry members. It consists of 17 floating-point programs written in C, C++, and Fortran, and 12 integer programs written in C and C++. The suite contains over 3 million lines of code. This is the fifth generation of processor-intensive suites from SPEC, replacing SPEC CPU2000, SPEC CPU95, SPEC CPU92, and SPEC CPU89 [HENN07].
Other SPEC suites include the following:
• SPECjvm98: Intended to evaluate performance of the combined hardware
and software aspects of the Java Virtual Machine (JVM) client platform
• SPECjbb2000 (Java Business Benchmark): A benchmark for evaluating
server-side Java-based electronic commerce applications
• SPECweb99: Evaluates the performance of World Wide Web (WWW) servers
• SPECmail2001: Designed to measure a system’s performance acting as a mail
server
AVERAGING RESULTS To obtain a reliable comparison of the performance of vari-
ous computers, it is preferable to run a number of different benchmark programs on
each machine and then average the results. For example, if results are obtained for m different benchmark programs, then a simple arithmetic mean can be calculated as follows:
$$R_A = \frac{1}{m} \sum_{i=1}^{m} R_i \qquad (2.3)$$
where $R_i$ is the high-level language instruction execution rate for the ith benchmark program.
An alternative is to take the harmonic mean:
$$R_H = \frac{m}{\sum_{i=1}^{m} \dfrac{1}{R_i}} \qquad (2.4)$$
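The two means can be sketched directly in code. The rates below are made-up illustrative values (e.g., millions of instructions per second), not figures from the text:

```python
# Hypothetical per-benchmark execution rates R_i for one machine
# (illustrative values only).
rates = [100.0, 200.0, 400.0]
m = len(rates)

# Arithmetic mean of the rates, per Equation (2.3)
r_a = sum(rates) / m

# Harmonic mean of the rates, per Equation (2.4)
r_h = m / sum(1.0 / r for r in rates)

print(f"arithmetic mean: {r_a:.2f}")  # dominated by the fastest benchmark
print(f"harmonic mean:   {r_h:.2f}")  # always <= the arithmetic mean
```

Note that the harmonic mean is pulled toward the slowest rates, which, as the following discussion argues, makes it the better match for total execution time.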
Ultimately, the user is concerned with the execution time of a system, not its
execution rate. If we take the arithmetic mean of the instruction rates of various benchmark programs, we get a result that is proportional to the sum of the inverses of