Page 274 - DSP Integrated Circuits
P. 274

6.8 Interleaving and Pipelining                                       259

            To summarize, the two critical paths are broken into pieces of equal size. The
        last stage consists of only one addition. The wave-flow graph is partitioned into 10
        paths, each performing an addition and a "half-multiplication. Hence, the multi-
        pliers should have two pipeline stages. The last stage in the upper branch corre-
        sponds to a shimming delay of length T mi n. The group delay of the filter has,
        because of the pipelining, increased by a constant factor Ar^ = 6 T mi n.





        6.8.3 Functional and Structural Pipelines

        Both algorithms and hardware structures can be pipelined. In classical pipe-
        lines, only one operation is executed by each pipeline stage (PE) separated by
        registers (delay elements). A PE is allocated to each stage of the pipeline. The
        pipeline stages do not share resources. This leads to a fixed and simple communi-
        cation pattern between the processors. This case is often referred to as a struc-
        tural pipeline. Figure 6.53 illustrates a pipelined algorithm with three stages
        and the corresponding structural pipeline where each process has been assigned
        to its own PE.















                               Figure 6.53 Structural pipeline



            A more general case is to partition the pipeline stages into several smaller
        subtasks (operations) that are subsequently mapped onto a hardware structure.
        The operations can of course be executed by a structural pipeline, but a more effi-
        cient way to organize the hardware structure in terms of utilization of the
        resources is to allow the pipeline stages to share resources. That is, operations
        within a pipeline stage are not bounded to a particular PE, and operations that
        are not concurrent may share resources. This is referred to as a functional pipe-
        line.
            Figure 6.54 shows three processes that require different amounts of opera-
        tions within the different stages. For simplicity we neglect precedence relations
        between operations inside the processes. Obviously, nine PEs (six adders and three
        multipliers) are required if a structural pipeline is used.
            Only three PEs (two adders and one multiplier) are required if, instead, the
        operations can be scheduled as shown in Figure 6.54. Pipelining is used here to
        increase the parallelism in the algorithm so that the operations may be scheduled
        more freely.
   269   270   271   272   273   274   275   276   277   278   279