Page 496 - DSP Integrated Circuits


        simplifies the wiring and the layout. The Baugh-Wooley multiplier algorithm
        can also be used to realize serial/parallel multipliers. The power consumption is of
        the order $O(W_d^3)$.

        11.5.7 Look-Up Table Techniques
        Multiplication can also be done by using look-up tables. A straightforward
        implementation would require a table with $2^{W_d + W_c}$ words. A ROM of this size would
        require too much chip area and also be too slow for the typical word lengths used in
        digital filters. A more efficient approach is based on the identity

            $$a \cdot x = \frac{1}{4}\left[(a + x)^2 - (a - x)^2\right]$$

            The multiplication can be carried out with only one addition, two subtractions,
        and two table look-up operations. The size of the look-up table, which stores
        squares, is reduced to only $2^{W_d}$ words. Hence, this technique can be used for
        numbers with small word lengths, for example, up to eight or nine bits.
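
        The operation count above (one addition, two subtractions, two look-ups) can be
        illustrated with a minimal Python sketch based on the quarter-square identity
        a*x = [(a + x)^2 - (a - x)^2] / 4. The function names, the restriction to
        non-negative operands with a + x < 2^W_d, and the use of a magnitude for the
        second table index are assumptions made for illustration, not details from the text.

```python
def make_square_table(word_length):
    """Table of squares n^2 for n = 0 .. 2^word_length - 1 (2^W_d words)."""
    return [n * n for n in range(2 ** word_length)]


def lut_multiply(a, x, squares):
    """Multiply two non-negative integers using two look-ups in a table of squares.

    Assumption: a + x must fit in the table, i.e. a + x < len(squares).
    """
    s = a + x              # the single addition
    d = abs(a - x)         # first subtraction; squares are symmetric, so use |a - x|
    if a < 0 or x < 0 or s >= len(squares):
        raise ValueError("operands out of range for this table")
    diff = squares[s] - squares[d]   # two look-ups and the second subtraction
    return diff >> 2       # (a+x)^2 - (a-x)^2 = 4*a*x, so dividing by 4 is exact


squares = make_square_table(8)       # 256 words for W_d = 8
print(lut_multiply(13, 11, squares))  # → 143
```

        The division by four is a two-bit shift, so it costs no arithmetic hardware. The
        table grows exponentially with the word length, which is why the text limits the
        technique to short words.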


        11.6 BIT-SERIAL ARITHMETIC

        Bit-serial arithmetic is a viable alternative in digital signal processing applica-
        tions to traditional, bit-parallel arithmetic. A major advantage of bit-serial over
        bit-parallel arithmetic is that it significantly reduces chip area. This is done in two
        ways. First, it eliminates wide buses and simplifies wire routing. Second, by using
        small processing elements, the chip itself will be smaller and require shorter wir-
        ing. A small chip can support higher clock frequencies and is therefore faster.
        Two's-complement and binary offset representations are suitable for DSP algo-
        rithms implemented with bit-serial arithmetic, since the bit-serial operations can
        be done without knowing the sign of the numbers involved. Since two's-comple-
        ment and binary offset representation use similar algorithms, we will discuss only
        bit-serial arithmetic with two's-complement representation.
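
        The remark that bit-serial operations need no knowledge of the operand signs can
        be illustrated with a small Python model of a bit-serial adder: one full adder and
        one carry flip-flop processing two's-complement words least-significant bit first.
        This is a behavioral sketch under stated assumptions (LSB-first transmission, equal
        word lengths, final carry discarded); the helper names are hypothetical.

```python
def to_bits_lsb_first(value, word_length):
    """Two's-complement bits of value, least-significant bit first."""
    return [(value >> i) & 1 for i in range(word_length)]


def from_bits_lsb_first(bits):
    """Interpret an LSB-first bit list as a two's-complement integer."""
    unsigned = sum(bit << i for i, bit in enumerate(bits))
    # The last bit transmitted is the sign bit with weight -2^(W-1).
    return unsigned - (1 << len(bits)) if bits[-1] else unsigned


def bit_serial_add(x_bits, y_bits):
    """Add two LSB-first bit streams, one bit per clock cycle."""
    carry = 0                      # carry flip-flop, cleared at the start of a word
    sum_bits = []
    for xb, yb in zip(x_bits, y_bits):
        sum_bits.append(xb ^ yb ^ carry)              # full-adder sum output
        carry = (xb & yb) | (carry & (xb ^ yb))       # full-adder carry output
    return sum_bits                # final carry is discarded (modulo 2^W result)


W = 8
result = bit_serial_add(to_bits_lsb_first(-5, W), to_bits_lsb_first(3, W))
print(from_bits_lsb_first(result))  # → -2
```

        The loop body is identical for positive and negative operands; the sign bit is
        processed like any other bit. Overflow is not detected in this sketch.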
            A major issue in the design of the building blocks is to decide how data shall
        be transmitted and processed. There are two basic possibilities: bit-serial or
        bit-parallel transmission. Usually, this choice also governs how operations are
        performed. When making this choice, it is easy to draw the wrong conclusion given
        the following argument: "Since bit-serial arithmetic processes only one bit at each
        time instance, bit-parallel arithmetic must be about $W_d$ times faster, assuming the
        data word length is $W_d$." In reality, the ratio in speed will be much smaller due to
        the long carry propagation paths present in parallel arithmetic. Furthermore,
        bit-parallel arithmetic uses more than $W_d$ times as large a chip area as does bit-serial
        arithmetic. In fact, the computational throughput per unit chip area is higher
        for bit-serial than for bit-parallel arithmetic. On the whole, bit-serial arithmetic
        is often superior.
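
        The argument in this paragraph can be written out as a short derivation. The
        symbols below are assumptions introduced for illustration and do not appear in the
        text: $t_s$ is the clock period of the bit-serial unit, $t_p$ is the time per
        operation of the bit-parallel unit (with $t_p > t_s$ because of carry propagation),
        and $A_s$, $A_p$ are the respective chip areas (with $A_p > W_d A_s$, as stated
        above).

```latex
% Bit-serial: one word every W_d clock cycles; bit-parallel: one word every t_p.
\begin{align*}
\frac{\text{throughput}_p}{\text{throughput}_s}
  &= \frac{1/t_p}{1/(W_d\, t_s)} = W_d\,\frac{t_s}{t_p} < W_d, \\
\frac{\text{throughput}_p / A_p}{\text{throughput}_s / A_s}
  &= W_d\,\frac{t_s}{t_p}\cdot\frac{A_s}{A_p}
   < W_d \cdot 1 \cdot \frac{1}{W_d} = 1 .
\end{align*}
```

        Under these assumptions the speed advantage of bit-parallel arithmetic is less than
        $W_d$, and its throughput per unit area is lower than that of bit-serial arithmetic.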
            The issue of comparing power consumption is, however, more complicated.
        Bit-parallel arithmetic suffers from energy losses in glitches that occur when the
        carry propagates, but the glitches will be few if successive data are strongly
        correlated. Driving long and wide buses also consumes large amounts of power. Bit-serial
        arithmetic, on the other hand, performs only useful computations without any
        glitches, but requires more clocked elements that will consume significant amounts