Page 147 - A Practical Guide from Design Planning to Manufacturing


        RISC versus EPIC
        Probably the most important new architecture since the development
        of RISC is the Explicitly Parallel Instruction Computing (EPIC) archi-
        tecture. The first implementation of this architecture was the Intel
        Itanium, which began shipping in 2001. EPIC architecture was designed
        for implementations with even higher transistor counts than the origi-
        nal RISC processors. Superscalar processors of the 1990s had the func-
        tional units to execute multiple instructions in parallel. However, they
        spent a great deal of die area on scheduling circuits that determined
        which instructions could execute in parallel. One proposed solution
        was the Very Long Instruction Word (VLIW) architecture.
          VLIW architectures bundle multiple instructions that can be exe-
        cuted in parallel into a single long instruction. The compiler performs
        the scheduling, so that the processor avoids wasting run time and sili-
        con area determining which instructions to execute in parallel. The
        compiler has the ability to look at hundreds of instructions to find inde-
        pendent operations, so in theory it can do a much better job scheduling
        than hardware implementations. In practice, however, VLIW approaches
        had two significant shortcomings.
          One problem was a dramatic increase in code size. If enough
        independent instructions could not be found to fill all the slots in
        a VLIW instruction, the remaining slots were wasted. If a fixed
        instruction size was to be maintained, each long instruction had to
        be the same length no matter how many or how few of its execution
        slots were used. This caused programs compiled for VLIW architectures
        to become much larger than even RISC code.
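
          The code-size effect can be illustrated with a minimal sketch of a
        compiler packing operations into fixed-width VLIW words. Everything
        here is hypothetical and chosen for illustration: the Op record, the
        pack_vliw and count_nops helpers, the register names, and the
        five-operation program. The packer is a simple in-order greedy one
        that only checks whether an operation reads or writes a register
        already written in the current word; a real VLIW compiler would
        reorder code and look much further ahead.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Op:
    name: str
    dest: Optional[str]       # register written, or None
    srcs: Tuple[str, ...]     # registers read


# Filler placed in slots the compiler could not use.
NOP = Op("nop", None, ())


def pack_vliw(ops: List[Op], width: int) -> List[List[Op]]:
    """Pack ops, in order, into words of exactly `width` slots.

    An op starts a new word if the current word is full or if the op
    reads or writes a register written earlier in the current word.
    Unused slots are padded with NOP so every word has the same length.
    """
    words: List[List[Op]] = []
    current: List[Op] = []
    written = set()
    for op in ops:
        depends = op.dest in written or any(s in written for s in op.srcs)
        if current and (depends or len(current) == width):
            words.append(current)
            current, written = [], set()
        current.append(op)
        if op.dest is not None:
            written.add(op.dest)
    if current:
        words.append(current)
    return [w + [NOP] * (width - len(w)) for w in words]


def count_nops(words: List[List[Op]]) -> int:
    """Count the slots that carry no useful work."""
    return sum(op is NOP for word in words for op in word)


# Hypothetical program: two independent loads, then a dependent chain.
program = [
    Op("load", "r1", ("r0",)),
    Op("load", "r2", ("r0",)),
    Op("add", "r3", ("r1", "r2")),
    Op("mul", "r4", ("r3", "r3")),
    Op("sub", "r5", ("r4", "r1")),
]

for width in (2, 4):
    words = pack_vliw(program, width)
    print(f"width={width}: {len(words)} words, {count_nops(words)} wasted slots")
```

          For this program both widths need four words, because the
        dependent chain cannot be overlapped. The 2-wide encoding wastes 3
        slots and the 4-wide encoding wastes 11, so a wider fixed-length
        instruction makes the same program larger without making it faster.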
          Another problem was the potential lack of binary compatibility from
        one processor implementation to the next. Because superscalar machines
        use hardware to determine instruction scheduling, new implementa-
        tions can increase the number of instructions that are executed in par-
        allel while maintaining compatibility. If a new VLIW processor had a
        larger instruction width, all the code for the previous generation would
        need to be recompiled for proper scheduling. The EPIC architecture tries
        to take the good points of VLIW architectures while addressing these
        problems.
          The EPIC architecture encodes its instructions into 128-bit-wide bun-
        dles. Each bundle contains three instructions encoded in 41 bits each
        and a 5-bit template field (3 x 41 + 5 = 128). The template field
        identifies the types of instructions in the bundle and which of them
        can be executed in parallel. Because dependent instructions can share
        a bundle, all the slots of a bundle can be filled even if enough
        independent instructions cannot be found. The template
        also specifies whether one or more instructions in this bundle can be exe-
        cuted in parallel with at least the first instruction of the next bundle.
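
          The bundle layout described above can be sketched as plain bit
        packing. The field widths come from the text; the helper names
        pack_bundle and unpack_bundle are hypothetical, and the field order
        (template in the lowest five bits, followed by slots 0 through 2) is
        an assumption of this sketch rather than something stated here. The
        slot and template values are treated as opaque integers.

```python
from typing import Sequence, Tuple

SLOT_BITS = 41
TEMPLATE_BITS = 5
SLOTS_PER_BUNDLE = 3
BUNDLE_BITS = SLOTS_PER_BUNDLE * SLOT_BITS + TEMPLATE_BITS  # 3 * 41 + 5

SLOT_MASK = (1 << SLOT_BITS) - 1
TEMPLATE_MASK = (1 << TEMPLATE_BITS) - 1


def pack_bundle(template: int, slots: Sequence[int]) -> int:
    """Combine a 5-bit template and three 41-bit slots into one integer.

    Assumed layout: template in bits 0-4, slot i starting at bit
    5 + 41 * i.
    """
    if not 0 <= template <= TEMPLATE_MASK:
        raise ValueError("template does not fit in 5 bits")
    if len(slots) != SLOTS_PER_BUNDLE:
        raise ValueError("a bundle holds exactly three instructions")
    bundle = template
    for i, slot in enumerate(slots):
        if not 0 <= slot <= SLOT_MASK:
            raise ValueError(f"slot {i} does not fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle


def unpack_bundle(bundle: int) -> Tuple[int, Tuple[int, ...]]:
    """Split a bundle back into its template and three slots."""
    template = bundle & TEMPLATE_MASK
    slots = tuple(
        (bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & SLOT_MASK
        for i in range(SLOTS_PER_BUNDLE)
    )
    return template, slots


# Round trip with arbitrary placeholder values.
bundle = pack_bundle(0b10001, (0x123, 0x1_0000_0000, SLOT_MASK))
print(BUNDLE_BITS, unpack_bundle(bundle))
```

          Filling every field with ones produces a value that occupies
        exactly 128 bits, which confirms that three 41-bit slots and a 5-bit
        template account for the whole bundle with no bits left over.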
        This means there is no limit on how many instructions the compiler