Page 381 - DSP Integrated Circuits

366                                              Chapter 8 DSP Architectures

        of these assignments determine the communication requirement—i.e., communi-
        cation channels and their bandwidth, etc. Hence, the minimum requirements are
        specified in these design steps. Therefore, to each static schedule corresponds a
        class of ideal multiprocessor architectures.
            An ideal DSP architecture belongs to a class of architectures that implements
        the static schedule. An ideal architecture has processing elements that can exe-
        cute the operations according to the schedule and is supported with appropriate
        communication channels and memories.
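
        As a rough illustration of how a static schedule fixes the minimum processing
        resources, the sketch below counts the peak number of simultaneously executing
        operations of each type. The schedule format (operation type, start time,
        duration), the function name, and the example data are all hypothetical, and
        the sketch assumes that no operation wraps around the end of the sample period.

```python
from collections import defaultdict


def min_pes_per_type(schedule):
    """Return, per operation type, the peak number of overlapping operations.

    `schedule` is a list of (pe_type, start, duration) tuples (assumed format).
    The peak concurrency is the smallest number of PEs of that type that can
    execute the schedule as given.
    """
    events = defaultdict(list)
    for pe_type, start, duration in schedule:
        events[pe_type].append((start, +1))             # operation begins
        events[pe_type].append((start + duration, -1))  # operation ends

    peaks = {}
    for pe_type, evs in events.items():
        # At equal times, process ends (-1) before starts (+1) so that a PE
        # released at time t can be reused by an operation starting at t.
        evs.sort(key=lambda e: (e[0], e[1]))
        running = peak = 0
        for _, delta in evs:
            running += delta
            peak = max(peak, running)
        peaks[pe_type] = peak
    return peaks


# Hypothetical schedule within one sample period (time in clock cycles).
example_schedule = [
    ("mul", 0, 2),
    ("mul", 1, 2),
    ("add", 2, 1),
    ("mul", 3, 2),
    ("add", 3, 1),
]

print(min_pes_per_type(example_schedule))
```

        Changing the start times changes the peak counts, which is one way to see why
        a new schedule generally leads to a new class of architectures.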
            Note that there may be several architectures that implement a given sched-
        ule, and that a new class of architectures is obtained if the schedule is changed.
        Algorithms that require dynamic scheduling lead to architectures that either must
        handle worst-case situations or are optimized in a statistical sense. However, the
        execution time must be predictable since the sample period constraint must be
        met in hard real-time applications [13]. Architectures of the latter type are
        therefore difficult to use.


        8.4.1 Processing Elements

        Processing elements (PEs) usually perform simple, memoryless mappings of the
        input values to a single output value. The arithmetic operations commonly used in
        DSP algorithms are
            Add/sub, add/sub-and-shift
            Multiply, multiply-and-accumulate
            Vector product
            Two-port adaptor
            Butterfly
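
        A minimal sketch of three of these operations as memoryless functions is given
        below. The function names are illustrative only. The butterfly is assumed to
        be the radix-2 decimation-in-time form, and the two-port adaptor is assumed to
        be the symmetric form used in wave digital filters, with adaptor coefficient
        alpha. Note that the last two produce two output values each.

```python
def mac(acc, x, y):
    """Multiply-and-accumulate: return acc + x * y."""
    return acc + x * y


def butterfly(a, b, w):
    """Radix-2 decimation-in-time butterfly (assumed form).

    a, b are the inputs and w is the twiddle factor; all may be complex.
    """
    t = w * b
    return a + t, a - t


def two_port_adaptor(a1, a2, alpha):
    """Symmetric two-port adaptor (assumed form).

    a1, a2 are the incident waves; returns the reflected waves (b1, b2).
    The product alpha * (a2 - a1) is computed once and shared by both outputs.
    """
    d = alpha * (a2 - a1)
    return a2 + d, a1 + d


print(mac(1, 2, 3))
print(butterfly(1 + 0j, 1 + 0j, 1 + 0j))
print(two_port_adaptor(1.0, 3.0, 0.5))
```

        None of the functions keeps state between calls, which is what is meant by a
        memoryless mapping.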
            We will reserve the more general term processor to denote a PE with its inter-
        nal memory and control circuitry. Hence, a processor is able to perform a task
        independently of other processors.
            If several processing elements always operate on the same inputs, it may be
        advantageous to merge these into one PE with multiple inputs and outputs, for
        example, two-port adaptors and butterflies. Experience indicates that it is
        advantageous to use the largest operations possible (i.e., large PE granularity)
        since this tends to reduce the communication. However, flexibility in scheduling
        the operations is reduced and resource utilization may become poor if the
        operations chosen are too large. As always, a good trade-off is the best.

        Figure 8.10 Processing element with multiple inputs

            At this point it is interesting to note that the execution time for processing
        elements and the cycle time (read and write) for memories manufactured in the
        same technology are of the same order. Hence, to fully utilize a multiple-input
        processing element, as shown in Figure 8.10, one memory or memory port must be
        provided for each input and output value.
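
        The port count can be estimated with a small calculation. The sketch below is
        a simplified model, not taken from the text: it assumes that every input and
        output value requires one memory access per PE operation and that each port
        completes one access per memory cycle. The function name and the timing values
        are hypothetical.

```python
def memory_ports_needed(n_inputs, n_outputs, t_pe, t_mem):
    """Estimate the memory ports needed to keep one PE fully utilized.

    t_pe  : PE execution time per operation (integer time units, e.g. ns)
    t_mem : memory cycle time per access (same integer time units)

    The PE needs (n_inputs + n_outputs) accesses every t_pe time units, and
    each port delivers one access every t_mem time units.
    """
    accesses = n_inputs + n_outputs
    # Integer ceiling division avoids floating-point rounding.
    return -(-(accesses * t_mem) // t_pe)


# Equal PE and memory times: one port per input and output value.
print(memory_ports_needed(2, 2, 10, 10))
# A PE that is twice as slow as the memory needs half as many ports.
print(memory_ports_needed(2, 2, 20, 10))
```

        When the two times are equal, the estimate reduces to one port per input and
        output value, in agreement with the statement above.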