Page 147 - A Practical Guide from Design Planning to Manufacturing
120 Chapter Four
RISC versus EPIC
Probably the most important new architecture since the development of
RISC is the Explicitly Parallel Instruction Computing (EPIC)
architecture. The first implementation of this architecture was the
Intel Itanium, which began shipping in 2001. The EPIC architecture was
designed for implementations with even higher transistor counts than
the original RISC processors. Superscalar processors of the 1990s had
the functional units to execute multiple instructions in parallel.
However, they spent a great deal of die area on the scheduling circuits
that determine which instructions can execute in parallel. One
suggested solution to this was Very Long Instruction Word (VLIW)
architectures.
VLIW architectures bundle multiple instructions that can be executed in
parallel into a single long instruction. The compiler performs the
scheduling, so the processor avoids wasting run time and silicon area
determining which instructions to execute in parallel. The compiler can
look at hundreds of instructions to find independent operations, so in
theory it can do a much better job of scheduling than hardware
implementations. However, VLIW approaches had two significant
shortcomings.
One problem was a dramatic increase in code size. If enough independent
instructions cannot be found to fill all the slots in a VLIW
instruction, the remaining slots are wasted. If a fixed instruction
size is to be maintained, then each long instruction must be the same
length no matter how many or how few of its execution slots are used.
This caused programs compiled for VLIW architectures to become much
larger than even RISC code.
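The scheduling and code-size trade-off described above can be sketched
in a few lines of Python. This is only an illustration under stated
assumptions, not how any production VLIW compiler works: the names
Instr, schedule, and empty_slots are hypothetical, each register is
assumed to be written exactly once (so only read-after-write
dependencies matter), and every operation is assumed to fit any slot.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    dest: str     # register written (assumed to be written only once)
    srcs: tuple   # registers read


def schedule(instrs, width):
    """Pack instructions into fixed-width long instruction words.

    Each instruction goes into the earliest word that comes after every
    word producing one of its sources and that still has a free slot.
    """
    words = []     # each word is a list of at most `width` instructions
    word_of = {}   # register name -> index of the word that writes it
    for ins in instrs:
        # Sources not produced by this program (inputs) impose no constraint.
        earliest = max(
            (word_of[s] + 1 for s in ins.srcs if s in word_of), default=0
        )
        w = earliest
        while w < len(words) and len(words[w]) >= width:
            w += 1                      # skip words that are already full
        if w == len(words):
            words.append([])            # open a new long instruction word
        words[w].append(ins)
        word_of[ins.dest] = w
    return words


def empty_slots(words, width):
    """Count slots that must be padded to keep every word the same length."""
    return sum(width - len(word) for word in words)


# Hypothetical six-operation program; x, y, z are inputs.
program = [
    Instr("a", ("x",)),
    Instr("b", ("y",)),
    Instr("c", ("a", "b")),
    Instr("d", ("z",)),
    Instr("e", ("c", "d")),
    Instr("f", ("a", "d")),
]

words = schedule(program, width=3)
for index, word in enumerate(words):
    print(index, [ins.dest for ins in word])
print("empty slots:", empty_slots(words, width=3))
```

For this program and a width of 3, the sketch produces three words with
three empty slots, so the fixed-width encoding occupies nine slots for
six useful operations. Widening the words to 4 leaves the word count at
three but raises the number of empty slots to six.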
Another problem was the potential lack of binary compatibility from one
processor implementation to the next. Because superscalar machines use
hardware to determine instruction scheduling, new implementations can
increase the number of instructions that are executed in parallel while
maintaining compatibility. If a new VLIW processor had a larger
instruction width, all the code for the previous generation would need
to be recompiled for proper scheduling. The EPIC architecture tries to
take the good points of VLIW architectures while addressing these
problems.
The EPIC architecture encodes its instructions into 128-bit-wide
bundles. Each bundle contains three instructions encoded in 41 bits
each and a 5-bit template field (3 × 41 + 5 = 128). The template field
contains information about the types of instructions in the bundle and
which instructions can be executed in parallel. Because the template
marks explicitly which instructions may run together, all the slots of
a bundle can be filled even if enough independent instructions cannot
be found. The template also specifies whether one or more instructions
in this bundle can be executed in parallel with at least the first
instruction of the next bundle.
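The bundle arithmetic above can be checked with a short Python sketch
that packs three 41-bit instruction slots and a 5-bit template into one
128-bit integer. The function names pack_bundle and unpack_bundle are
hypothetical, and the field order used here (template in the lowest
bits, followed by slots 0 to 2) is an assumption made for illustration;
the text specifies only the field widths.

```python
SLOT_BITS = 41       # width of each instruction slot, from the text
TEMPLATE_BITS = 5    # width of the template field, from the text
SLOTS_PER_BUNDLE = 3


def pack_bundle(template, slots):
    """Combine a template and three slot values into one 128-bit integer."""
    if not 0 <= template < (1 << TEMPLATE_BITS):
        raise ValueError("template must fit in 5 bits")
    if len(slots) != SLOTS_PER_BUNDLE:
        raise ValueError("a bundle holds exactly three instructions")
    bundle = template                      # assumed: template in low bits
    for i, slot in enumerate(slots):
        if not 0 <= slot < (1 << SLOT_BITS):
            raise ValueError("each instruction must fit in 41 bits")
        bundle |= slot << (TEMPLATE_BITS + i * SLOT_BITS)
    return bundle


def unpack_bundle(bundle):
    """Split a 128-bit integer back into its template and three slots."""
    template = bundle & ((1 << TEMPLATE_BITS) - 1)
    slot_mask = (1 << SLOT_BITS) - 1
    slots = [
        (bundle >> (TEMPLATE_BITS + i * SLOT_BITS)) & slot_mask
        for i in range(SLOTS_PER_BUNDLE)
    ]
    return template, slots


# Round trip with arbitrary placeholder values (not real encodings).
bundle = pack_bundle(0b10010, [0x123, 0x456, 0x789])
print(bundle.bit_length() <= 128, unpack_bundle(bundle))
```

Filling every field with its maximum value yields an integer of exactly
128 bits, which confirms that the three slots and the template account
for the whole bundle with no bits left over.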
This means there is no limit on how many instructions the compiler