Page 266 - DSP Integrated Circuits
P. 266

6.7 Equivalence Transformations                                       251

        shimming delays (D flip-flops) are needed, but 11 of these are taken from the
        delay elements. The minimum delay between input and output is 11 clock cycles.
        Altogether, 51 D flip-flops are required. The final wave-flow graph, shown in Fig-
        ure 6.37, with proper timing can be readily transformed into a computation
        graph.




        6.7.3 Minimizing the Amount of Shimming Delay

        Minimizing the required shimming delay in the computation graph is important in
        the isomorphic implementation approach based on bit-serial arithmetic, which will
        be discussed in Chapter 9. The main reason is that the shimming delays in bit-serial
        arithmetic represent moving signal values (shift registers) that consume significant
        amounts of power. The amount of shimming delay is of less importance in bit-paral-
        lel implementations since data can be stored for arbitrarily long times in static regis-
        ters, i.e., the power consumption does not depend on how long the values are stored.
            The amount of shimming delay can be minimized by using Theorem 6.4,
        which is based on the fact that the ordering of additions (subtractions) is irrele-
        vant if two's-complement representation is used and if the final sum is properly
        scaled, as was discussed in Chapter 5.

            Theorem 6.4
            For a multi-input
            adder    (subtracter)
            that has inputs that
            arrive at times t\ <
            t2 <••• < t-tf the mini-
            mum latency and the
            minimum amount of
            shimming delay are
            attained  if, recur-
            sively, the two earliest
            inputs or a previously
            computed partial sum
            are added (subtracted)
            using two's-complement
            representation, as illus-
            trated in Figure 6.38.  Figure 6.38 Adder tree with minimum latency and
                                             amount of shimming delay
            The required amount
        ol smmmmg delay can olten be reduced by simple cnanges to me computation
        graph that do not affect the numerical properties of the algorithm [5]—for exam-
        ple, reordering the additions.


        6.7.4 Maximally Fast Critical Loops
        In many cases, the order of arithmetic operations can be changed without affecting
        the numerical properties of the algorithm. Generally, the associative, commutative,
   261   262   263   264   265   266   267   268   269   270   271