Page 270 - DSP Integrated Circuits


6.8 Interleaving and Pipelining

processor becomes slow and inefficient. In such cases it may be better to use spe-
cialized PEs for the different types of operations.

6.8.2 Pipelining

Pipelining, which is another method of increasing the throughput of a sequential
algorithm, can be used if the application permits the algorithmic delay to be
increased. This is usually the case when the system (algorithm) is not inside a
recursive loop, but there are many cases when the algorithmic delay must be kept
within certain limits.

Figure 6.45 Pipelining of three processes

Pipelining of the three processes shown in Figure 6.43 is accomplished by
propagating algorithmic delay elements into the original critical path so that a
set of new and shorter critical paths is obtained, as illustrated in Figure 6.45.
Ideally the critical path is broken into paths of equal length. Obviously, for
each of the three processes a new computation can begin as soon as the results of
the current computations are stored in the corresponding (output) delay elements.
At steady state, three successive output value computations are performed
concurrently. A pipeline with n stages allows n computations to be processed
concurrently and, when the stages are of equal length, attains a speedup of n
over sequential processing. Throughput is the inverse of the longest critical
path, i.e., the same throughput as with interleaving.
Note that the maximum sample rate is, in principle, "infinite" for a nonrecursive
structure (for example, an FIR filter), but this requires that the critical path be
divided into infinitesimally small pieces. Thus, the group delay of the filter will be
infinite.
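
The relation between stage lengths, throughput, and speedup can be sketched numerically. The following is a minimal illustration, not taken from the text: the function name `pipeline_metrics` and the example stage times are assumptions, and the model assumes a linear pipeline whose sample period is set by its longest stage.

```python
def pipeline_metrics(stage_times):
    """Return (throughput, speedup) for a linear pipeline.

    stage_times: execution time of each stage, in arbitrary time units
    (hypothetical values, chosen only for illustration).
    """
    period = max(stage_times)            # sample period is set by the longest stage
    throughput = 1.0 / period            # results per time unit at steady state
    sequential_time = sum(stage_times)   # time per result without any overlap
    speedup = sequential_time / period   # equals len(stage_times) only if balanced
    return throughput, speedup


balanced = pipeline_metrics([2.0, 2.0, 2.0])
unbalanced = pipeline_metrics([1.0, 2.0, 3.0])
print(balanced)    # throughput 0.5, speedup 3.0
print(unbalanced)  # speedup drops to 2.0 because the longest stage dominates
```

With equal stage lengths the speedup equals the number of stages; with unequal stages the longest one limits both the throughput and the speedup.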
The latency is changed only if the critical paths are of unequal length. In this
case, equalizing delays must be inserted into the shorter paths so that the effective
lengths become equal. The latency is thereby increased by the same amount as the
sum of all inserted equalizing delays.
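
The bookkeeping for equalizing delays can be illustrated with a short sketch. The path lengths below are hypothetical, and the helper name `equalize` is an assumption introduced for this example; the model treats the shorter critical paths as padded up to the longest one.

```python
def equalize(path_lengths):
    """Return (delays, added_latency) for a set of critical path lengths.

    Each shorter path is padded with an equalizing delay so that every
    path has the length of the longest one.
    """
    target = max(path_lengths)
    delays = [target - p for p in path_lengths]
    return delays, sum(delays)


paths = [3, 5, 4]                  # hypothetical path lengths, in time units
delays, added = equalize(paths)
print(delays, added)               # [2, 0, 1] 3

# The pipelined latency (number of paths times the longest path) exceeds
# the original latency (sum of the paths) by exactly the inserted delays.
assert len(paths) * max(paths) - sum(paths) == added
```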
The main differences between interleaving and pipelining are that in the lat-
ter case the output is delayed by two sample periods and that the amount of
resources is somewhat reduced. The number of PEs is still three, but each PE is
now required to execute only one process per sample period. Hence a specialized,
and thereby less expensive, PE may be used for each process. In pipelining, each
PE operates on different parts of each process while in interleaving a PE operates
on every nth sample.
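
The difference in how work is assigned to the PEs can be made concrete with a small scheduling sketch. This is an illustrative model only: the function names are assumptions, PEs and samples are numbered from zero, and each process is assumed to take exactly one sample period in the pipelined case.

```python
def pipelined_schedule(num_samples, num_stages=3):
    """Map each sample period t to the (PE index, sample index) pairs active in it."""
    schedule = {}
    for t in range(num_samples + num_stages - 1):
        active = []
        for stage in range(num_stages):
            sample = t - stage   # PE k works on the sample that entered k periods ago
            if 0 <= sample < num_samples:
                active.append((stage, sample))
        schedule[t] = active
    return schedule


def interleaved_assignment(num_samples, num_pes=3):
    """Sample i is handled entirely by PE i mod num_pes, which runs every process on it."""
    return {i: i % num_pes for i in range(num_samples)}


sched = pipelined_schedule(5)
# Period in which sample 0 reaches the last PE (index 2).
first_output_period = min(t for t, active in sched.items() if (2, 0) in active)
assignment = interleaved_assignment(5)

print(sched[2])             # [(0, 2), (1, 1), (2, 0)]: three samples in flight
print(first_output_period)  # 2: the output is delayed by two sample periods
print(assignment)           # {0: 0, 1: 1, 2: 2, 3: 0, 4: 1}
```

In the pipelined schedule every PE executes a single process in each period, whereas in the interleaved assignment each PE sees only every third sample but must run all three processes on it.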
Pipelining can be applied to sequential algorithms at any level of abstraction.
Typically, both basic DSP algorithms and arithmetic operations are pipelined. At
the digital circuit level pipelining corresponds to inserting latches between differ-
ent circuit levels so that each level has the same transistor depth. Note that paral-
lelism in a structure is a fundamental property that cannot be changed. By
inserting delay elements into the critical path (for example, by pipelining), a new
structure with a higher degree of parallelism is obtained.
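
Choosing where to insert the delay elements (or latches) so that the resulting paths are as equal as possible can be sketched as a small search problem. The following brute-force example is an illustration under stated assumptions, not a method described in the text: the critical path is modeled as a chain of operations with hypothetical delays, and `best_cut` is a name introduced here.

```python
from itertools import combinations


def best_cut(op_delays, num_stages):
    """Split a chain of operation delays into contiguous stages.

    Returns (cuts, stage_lengths) minimizing the longest stage. Exhaustive
    search, so it is only suitable for short chains.
    """
    n = len(op_delays)
    best = None
    for cuts in combinations(range(1, n), num_stages - 1):
        bounds = (0, *cuts, n)
        stages = [sum(op_delays[a:b]) for a, b in zip(bounds, bounds[1:])]
        if best is None or max(stages) < max(best[1]):
            best = (cuts, stages)
    return best


# Hypothetical delays of five operations along the original critical path.
cuts, stages = best_cut([2, 1, 1, 2, 3], num_stages=3)
print(cuts, stages)   # (2, 4) [3, 3, 3]
```

Here the delay elements would be placed after the second and fourth operations, giving three paths of equal length.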
