Page 168 - A Practical Guide from Design Planning to Manufacturing

Microarchitecture  141

        by designs from the opposite camp. The entire industry moves to higher
        performance by improving both frequency and IPC. Benchmarks are the
        best way of measuring processor performance, but they are far from per-
        fect. The biggest problems are choosing benchmark programs, compiler
        optimizations, and system configurations. Different choices for any of
        these may suddenly change which computer scores as the fastest on a
        given benchmark.
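
        As a rough illustration of how the choice of programs can change the
        winner, the sketch below ranks two machines on different program
        mixes. The machine names, program names, and run times are all
        hypothetical values invented for this example, not measured data.

```python
# Hypothetical run times in seconds (lower is better); not measured data.
RUN_TIMES = {
    "machine_a": {"integer_app": 10.0, "floating_point_app": 30.0},
    "machine_b": {"integer_app": 14.0, "floating_point_app": 20.0},
}


def fastest(programs):
    """Return the machine with the lowest total run time over `programs`."""
    totals = {
        machine: sum(times[p] for p in programs)
        for machine, times in RUN_TIMES.items()
    }
    return min(totals, key=totals.get)


print(fastest(["integer_app"]))                        # machine_a
print(fastest(["floating_point_app"]))                 # machine_b
print(fastest(["integer_app", "floating_point_app"]))  # machine_b (34 s vs 40 s)
```

        With these made-up numbers, the same two machines trade places
        depending only on which programs are included in the benchmark.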
          Choosing realistic programs as the obstacle course is perhaps the most
        difficult problem because which programs are really important varies so
        much from one user to another. Someone who plays many 3D games will
        find the performance of floating-point intensive programs relevant, whereas
        someone who does not may care only about the performance of integer pro-
        grams. An added complication is that the applications available change over
        time. Software developers take advantage of improving processor
        performance and greater memory capacity by creating new applications
        that require more performance and memory. Benchmarks must regularly
        update the programs they use in order to reflect current software. The
        SPEC integer benchmark program list was first created in 1989 and was
        updated in 1992, 1995, and 2000.
          Processors with large caches tend to look better running older bench-
        marks. Processor cache memory has increased quickly over time but not
        nearly as dramatically as the size of typical applications. Usually a
        processor will have nowhere near enough cache to hold all the code and
        data of a contemporary program. However, it might have a large
        enough cache to hold all of a benchmark, especially if that benchmark
        is a few years old. This creates a strong temptation for some companies
        to quote older benchmarks long after they have ceased to be represen-
        tative of current software.
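
        The effect of a benchmark fitting entirely in cache can be sketched
        with a toy model. The code below assumes a fully associative cache
        with least-recently-used replacement and a program that repeatedly
        sweeps through its working set in order. Both are simplifying
        assumptions made for illustration, and the function name and block
        counts are hypothetical.

```python
from collections import OrderedDict


def hit_rate(working_set_blocks, cache_blocks, passes=10):
    """Fraction of accesses that hit in a toy fully associative LRU cache.

    The program is modeled as sweeping through `working_set_blocks`
    distinct blocks in order, `passes` times.
    """
    cache = OrderedDict()
    hits = accesses = 0
    for _ in range(passes):
        for block in range(working_set_blocks):
            accesses += 1
            if block in cache:
                hits += 1
                cache.move_to_end(block)       # mark as most recently used
            else:
                if len(cache) >= cache_blocks:
                    cache.popitem(last=False)  # evict least recently used
                cache[block] = True
    return hits / accesses


print(hit_rate(100, 128))   # 0.9: only the first pass misses
print(hit_rate(1000, 128))  # 0.0: every block is evicted before it is reused
```

        In this model, a small, older benchmark runs almost entirely out of
        the cache, while a larger program gets no benefit from it at all.
        Real caches and real access patterns fall somewhere in between.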
          In order to be run on processors of different architectures, benchmarks
        must be written in high-level programming languages, but this means
        that the compiler that translates the code for execution on a particular
        processor can have a very large impact on the measured performance.
        Some microarchitectures rely more upon the compiler for reordering
        while others are better able to compensate for a simpler compiler by
        reordering at run time. Even a simple compiler design typically has
        many different optimizations that can be selected. Some will expose
        more parallelism at the cost of increasing code size. This may improve
        or hurt performance depending upon the details of the microarchitecture.
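
        Loop unrolling is one common optimization of this kind. The sketch
        below shows only the shape of the transformation in Python. A real
        compiler applies it to machine code, and whether it helps depends on
        the processor. The function names are hypothetical.

```python
def total_rolled(values):
    """One add per iteration; each add depends on the previous one."""
    total = 0
    for v in values:
        total += v
    return total


def total_unrolled(values):
    """Same sum, unrolled by four with independent partial sums."""
    s0 = s1 = s2 = s3 = 0
    n = len(values) - len(values) % 4
    for i in range(0, n, 4):
        # These four adds do not depend on one another, so hardware
        # with enough execution units could perform them in parallel.
        s0 += values[i]
        s1 += values[i + 1]
        s2 += values[i + 2]
        s3 += values[i + 3]
    for v in values[n:]:  # leftover elements when len is not a multiple of 4
        s0 += v
    return s0 + s1 + s2 + s3
```

        The unrolled version exposes four independent chains of additions
        but has a larger loop body and extra cleanup code, which is the
        code-size cost the text describes.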
          Processors with complex hardware reordering look relatively better
        when comparisons are made using simple compilers. Processors that need
        sophisticated compilers have helped drive improvements in compiler
        technology and these advances have been part of the steady progress in
        computer performance. However, sometimes undue attention is given to
        compiler optimizations that show dramatic improvements in performance