Page 311 - DSP Integrated Circuits

296                                             Chapter 7 DSP System Design

        gray lines indicate that the delay elements carry signal values from one sample
        interval to the next.
            The computation graph shown in Figure 7.12 can be directly interpreted as
        a schedule for the arithmetic operations. Five multipliers and two adders are
        needed. The number of time units of storage is 34. Clearly, a better schedule
        can be found. The operations can be scheduled to minimize the number of
        concurrent operations of the same type. The scheduling is done by
        interchanging the order of operations and delays in the same branch. Note that
        the order of two shift-invariant operations can be changed, as was discussed
        in Chapter 6.
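
        To illustrate what "concurrent operations of the same type" means for the
        resource count, the following minimal Python sketch finds, for each operation
        type, the largest number of operations active in the same time unit. The
        function name, the tuple layout, and the start times are hypothetical: they
        are not read off Figure 7.12 and they ignore data dependencies. A latency of
        four time units per multiplication and a sample interval of ten time units
        are assumed.

```python
from collections import Counter


def peak_concurrency(ops, period):
    """Return the peak number of simultaneously active operations per type.

    ops:    list of (op_type, start_time, latency) tuples (hypothetical data).
    period: length of the sample interval in time units. The schedule is
            treated as periodic, so time slots are taken modulo the period.
    """
    busy = Counter()  # (op_type, time_slot) -> number of active operations
    for op_type, start, latency in ops:
        for t in range(start, start + latency):
            busy[(op_type, t % period)] += 1

    peak = Counter()
    for (op_type, _slot), count in busy.items():
        peak[op_type] = max(peak[op_type], count)
    return dict(peak)


# Hypothetical start times, assuming 4-unit multiplications and a 10-unit interval.
all_at_once = [("mul", 0, 4)] * 5
staggered = [("mul", 0, 4), ("mul", 2, 4), ("mul", 4, 4), ("mul", 6, 4), ("mul", 6, 4)]

print(peak_concurrency(all_at_once, 10))  # five multipliers busy at once
print(peak_concurrency(staggered, 10))    # at most three busy at once
```

        The peak per type is the number of processing elements of that type that the
        schedule needs, which is why moving operations in time can reduce it.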

        Figure 7.12 Computation graph for direct form II with pipelining

            An improved schedule is shown in Figure 7.13. The amount of storage has
        increased to 38 time units while the numbers of multipliers and adders have
        been reduced to three and one, respectively. Thus, fewer PEs, but more
        storage, are needed for this schedule. It is obvious from Figure 7.13 that
        there are many possible schedules that require the same number of processing
        elements, but have different requirements on storage and communication
        resources, i.e., communication channels between processing elements and
        memories.
            Note that the average number of multipliers required is only
        (5 · 4)/10 = 2. This suggests that it may be possible to find a more efficient
        schedule that uses only two multipliers. However, it is not possible to find
        such a schedule while considering the operations during a single sample
        interval only. In the following sections we will show that it is necessary to
        perform the scheduling over several sample intervals.
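
        The averaging argument can be written as a small lower-bound calculation. The
        sketch below reads the expression as five multiplications of four time units
        each within a sample interval of ten time units; that reading, and the
        function name, are assumptions made for illustration. The result is only a
        lower bound: as noted above, it is not reached by a schedule confined to a
        single sample interval.

```python
def min_units(num_ops, latency, period):
    """Lower bound on the number of processing elements of one type.

    Total busy time (num_ops * latency) divided by the time available per
    unit (period), rounded up using integer ceiling division.
    """
    return (num_ops * latency + period - 1) // period


# Assumed reading of (5 * 4)/10: five 4-unit multiplications per 10-unit interval.
print(min_units(5, 4, 10))  # 2
```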
            Generally, a trade-off between computational resources and the amount of
        memory can be made by proper scheduling. Often, the communication cost is
        significant and should therefore also be taken into account.

        Figure 7.13 Improved schedule requiring fewer PEs
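
        The trade-off can be made concrete with a simple weighted cost comparison.
        The sketch below uses the resource counts quoted in the text for the two
        schedules (five multipliers, two adders, and 34 time units of storage versus
        three multipliers, one adder, and 38 time units of storage). The linear cost
        model and its weights are purely hypothetical, and communication cost is left
        out because the text gives no figures for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    multipliers: int
    adders: int
    storage: int  # time units of storage


def cost(s, w_mul, w_add, w_mem):
    """Weighted sum of PE and storage requirements (hypothetical cost model)."""
    return w_mul * s.multipliers + w_add * s.adders + w_mem * s.storage


direct = Schedule(multipliers=5, adders=2, storage=34)    # counts given for Figure 7.12
improved = Schedule(multipliers=3, adders=1, storage=38)  # counts given for Figure 7.13

# Hypothetical weights: one multiplier costs as much as 10 storage units, one adder as 2.
weights = dict(w_mul=10, w_add=2, w_mem=1)
print(cost(direct, **weights), cost(improved, **weights))  # 88 70
```

        With these weights the improved schedule is cheaper. If all three weights are
        set to 1, the totals become 41 and 42, so the preferred schedule depends on
        the relative cost of processing elements and memory.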