Page 1127 - The Mechatronics Handbook


The time required to finish N instructions in a pipeline with K stages can be calculated. Assume that
one instruction takes a total time T to complete and that each of the K stages has an equal processing
delay of T/K. With a pipeline scheme, the first instruction completes the pipeline after T, and a new
instruction then leaves the pipeline every stage delay T/K. Therefore, the delays of executing N
instructions without and with pipelining, respectively, are

$$T \cdot N$$ (42.1)

$$T + \frac{T}{K}\,(N - 1)$$ (42.2)
There is an initial delay in the pipelined execution model before every stage has an operation to execute.
This initial delay is usually called the pipeline start-up delay (P) and is equal to the total execution time
of one instruction; measured in stage delays of T/K, P equals the number of stages K. Dividing (42.1) by
(42.2) and expressing the result in stage delays, the speed-up of a pipelined machine relative to a
nonpipelined machine is calculated as

$$\frac{P \cdot N}{P + (N - 1)}$$ (42.3)
When N is much larger than the number of pipestages P, the ideal speed-up approaches P. This is an
intuitive result since there are P parts of the machine working in parallel, allowing the execution to go
about P times faster in ideal conditions.
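
As a quick numerical check of Eqs. (42.1) to (42.3), the following minimal Python sketch evaluates both delays and the resulting speed-up. The function names are illustrative and not from the text; the number of stages is written as k, which equals P when the start-up delay is counted in stage delays.

```python
def unpipelined_time(n, t):
    """Eq. (42.1): n instructions, each taking the full time t."""
    return t * n


def pipelined_time(n, t, k):
    """Eq. (42.2): the first result after t, then one result every t / k."""
    return t + (t / k) * (n - 1)


def speedup(n, k):
    """Eq. (42.3) with P = k stages; the instruction time t cancels out."""
    return (k * n) / (k + (n - 1))


if __name__ == "__main__":
    for n in (1, 10, 100, 10_000):
        print(n, round(speedup(n, k=5), 3))
```

For a five-stage pipeline the speed-up is exactly 1 for a single instruction and moves toward 5 as N grows, which matches the limiting argument above.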
The overlap of sequential instructions in a processor pipeline is shown in Fig. 42.4(b). The instruction
pipeline becomes full after the pipeline delay of P = 5 cycles. Although the pipeline configuration executes
operations in each stage of the processor, two important mechanisms are needed to ensure correct
functional operation between dependent instructions in the presence of data hazards. Data hazards occur
when instructions in the pipeline generate results that are needed by later instructions that have already
started in the pipeline. In the pipeline configuration of Fig. 42.4(a), register operands are initially retrieved
during the decode stage. However, the execute and memory stages can define register operands and hold
the correct current value, but they cannot update the register file until the later write-back
stage. Forwarding (or bypassing) is the action of retrieving the correct operand value for an executing
instruction between the initial register file access and any pending instruction’s register file updates.
Interlocking is the action of stalling an operation in the pipeline when conditions cause necessary register
operand results to be delayed. It is necessary to stall early stages of the machine so that the correct results
are used, and the machine does not proceed with incorrect values for source operands. The primary
causes of delay in pipeline execution are instruction fetch delay and memory latency.
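
The effect of forwarding and interlocking can be illustrated with a small stall counter. This is only a sketch under stated assumptions, not a model of any particular processor: it assumes a five-stage pipeline (fetch, decode, execute, memory, write-back), a register file that can be read in decode during the same cycle in which write-back updates it, ALU results that can be forwarded after the execute stage, and load results that can be forwarded after the memory stage. The `Instr` class, `count_stalls`, and the register names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Instr:
    dest: Optional[str]          # register written, or None
    srcs: Tuple[str, ...] = ()   # registers read
    is_load: bool = False        # result becomes available in the memory stage


def count_stalls(program, forwarding):
    """Count interlock stall cycles for a straight-line program."""
    earliest_decode = {}  # register -> first decode cycle allowed for a consumer
    stalls = 0
    prev_decode = 0       # decode cycle of the previous instruction
    for instr in program:
        wanted = prev_decode + 1  # decode cycle if no stall were needed
        actual = max([wanted] + [earliest_decode.get(r, 0) for r in instr.srcs])
        stalls += actual - wanted  # interlock: hold the instruction in decode
        prev_decode = actual
        if instr.dest is not None:
            if not forwarding:
                gap = 3   # consumer decodes when the producer is in write-back
            elif instr.is_load:
                gap = 2   # value is forwarded after the memory stage
            else:
                gap = 1   # value is forwarded after the execute stage
            earliest_decode[instr.dest] = actual + gap
    return stalls


program = [
    Instr("r1", ("r9",), is_load=True),  # load r1 from the address in r9
    Instr("r2", ("r1", "r3")),           # uses the loaded value immediately
    Instr("r4", ("r2", "r1")),           # uses the previous ALU result
]
print(count_stalls(program, forwarding=False))
print(count_stalls(program, forwarding=True))
```

Under these assumptions forwarding removes the stalls between dependent ALU operations, while a use that immediately follows a load still needs one interlock cycle.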
Branch Prediction
Branch instructions pose serious problems for pipelined processors because the hardware continues to
fetch and execute instructions, possibly from the wrong path, until the branch instructions are completed.
Executing incorrect instructions can result in severe performance degradation through the introduction
of wasted cycles into the instruction stream.
There are several methods for dealing with pipeline stalls caused by branch instructions. The simplest
scheme handles branches by treating every branch as either taken or not taken. This treatment
can be fixed for every branch or determined by the branch opcode. The designation allows the pipeline
to continue to fetch instructions as if the branch were a normal instruction. However, the fetched instructions
may need to be discarded and the instruction fetch restarted when the assumed branch outcome is incorrect.
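
The cost of such a fixed designation can be estimated by counting how often the assumed direction is wrong. The sketch below is illustrative only: the function names, the opcode labels in `trace`, and the two-cycle penalty are assumptions rather than values from the text.

```python
def mispredictions(branches, policy):
    """Count branches whose actual direction differs from the static guess.

    branches: iterable of (opcode, taken) pairs.
    policy:   True or False to treat every branch as taken or not taken,
              or a dict mapping opcode -> assumed direction.
    """
    wrong = 0
    for opcode, taken in branches:
        predicted = policy[opcode] if isinstance(policy, dict) else policy
        if predicted != taken:
            wrong += 1
    return wrong


def wasted_cycles(branches, policy, penalty):
    """Each wrong guess discards `penalty` cycles of fetched instructions."""
    return penalty * mispredictions(branches, policy)


# Hypothetical trace: a loop-closing branch and an error-check branch.
trace = [
    ("loop_back", True), ("loop_back", True), ("loop_back", False),
    ("error_check", False), ("error_check", False),
]
per_opcode = {"loop_back": True, "error_check": False}

print(wasted_cycles(trace, True, penalty=2))        # every branch taken
print(wasted_cycles(trace, False, penalty=2))       # every branch not taken
print(wasted_cycles(trace, per_opcode, penalty=2))  # direction set by opcode
```

On this made-up trace the per-opcode designation is wrong only once, at the loop exit, whereas either global setting is wrong more often.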
Delayed branching is another scheme that treats the set of sequential instructions following a branch
as delay slots. The delay-slot instructions are executed whether or not the branch instruction is taken.
Delayed branches are limited because compilers and program characteristics often cannot supply
enough instructions that execute independently of the branch direction. Improvements have
been introduced to provide nullifying branches, which include a predicted direction for the branch. When
the prediction is incorrect, the delay-slot instructions are nullified.
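
A rough way to compare the two delay-slot schemes is to count the slot cycles that do no useful work for a single branch. The following sketch rests on assumptions not stated in the text: slots the compiler cannot fill hold no-ops, and a nullifying branch fills its slots from the predicted path. The function name and parameters are hypothetical.

```python
def wasted_delay_slots(slots, filled, nullifying=False, prediction_correct=True):
    """Return the delay-slot cycles that do no useful work for one branch.

    slots:  number of delay slots after the branch.
    filled: slots the compiler filled with real instructions
            (the remainder are assumed to be no-ops).
    """
    if not 0 <= filled <= slots:
        raise ValueError("filled must be between 0 and slots")
    if nullifying and not prediction_correct:
        return slots           # every delay-slot instruction is nullified
    return slots - filled      # only the no-op slots are wasted


print(wasted_delay_slots(2, 1))
print(wasted_delay_slots(2, 2, nullifying=True, prediction_correct=True))
print(wasted_delay_slots(2, 2, nullifying=True, prediction_correct=False))
```

In this simplified model a nullifying branch helps only to the extent that the predicted path lets the compiler fill more slots than it could with direction-independent instructions.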
