Page 397 - DSP Integrated Circuits
P. 397
382 Chapter 8 DSP Architectures
In the case of homogeneous PEs capacity can be adjusted to the application by
multiplexing the PEs. Other advantages are that the PEs can be optimized to their
individual work load and that the interconnection network is simple and fixed. A
typical example of broadcasting data can be found in Chapter 11 where an imple-
mentation of a discrete cosine transform using several PEs is proposed. All the
PEs operate on the same set of input values. Each PE computes one of the DCT
components.
Interprocessor Communication
A common method of reducing the
number of memory transactions is
to provide communication channels
between the PEs, as illustrated in
Figure 8.29. In general, some mem-
ory cells must be placed between
the PEs (pipelining). Popular uses
for this type of architectures are
systolic and wave front arrays,
which combine pipelining and par-
allel processing. This important
class of architectures will be dis- Figure 8.29 Architecture with direct
cussed in Chapter 11. interprocessor communication
EXAMPLE 8.2
Figure 8.30 shows a shared-memory
architecture for an FFT using a single,
but composite, butterfly PE. The archi-
tecture has direct interprocessor com-
munication and uses complex data. The
real and imaginary parts of a complex
value are stored as one word in the
memory—i.e., at the same address. Dis-
cuss balance between computation and
memory bandwidth.
Since only one memory location is
needed for each complex value, the
inequality (8.1) reduces to
where
Thus, the computation versus
communication balance is reasonably
good if we assume that
Figure 8.30 FFT architecture with a
single butterfly