Page 496 - DSP Integrated Circuits
simplifies the wiring and the layout. The Baugh-Wooley multiplier algorithm
can also be used to realize serial/parallel multipliers. The power consumption is of
the order $O(W_d^3)$.
11.5.7 Look-Up Table Techniques
Multiplication can also be done by using look-up tables. A straightforward implementation
would require a table with $2^{W_d + W_c}$ words. A ROM of this size would
require too much chip area and also be too slow for the typical word length used in
digital filters. A more efficient approach is based on the identity

$$x \cdot y = \frac{1}{4}\left[(x + y)^2 - (x - y)^2\right]$$

The multiplication can be carried out by only one addition, two subtractions,
and two table look-up operations. The size of the lookup table, which stores
squares, is reduced to only $2^{W_d}$ words. Hence, this technique can be used for
numbers with small word lengths, for example, up to eight to nine bits.
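
The table-based scheme can be illustrated with a minimal Python sketch built on the quarter-square identity, x*y = ((x + y)^2 - (x - y)^2) / 4. The names `make_square_table` and `table_multiply` are hypothetical, and the operands are assumed to be unsigned W-bit integers. Because the table is indexed directly by x + y, this sketch stores about 2^(W+1) squares; the operand scaling or table organization that leads to the 2^(W_d)-word figure quoted above is not reproduced here.

```python
def make_square_table(w):
    """Squares of 0 .. 2**(w + 1) - 2, enough to index by x + y
    when x and y are unsigned w-bit integers (an assumption of this sketch)."""
    return [n * n for n in range(2 ** (w + 1) - 1)]


def table_multiply(x, y, squares):
    """Multiply x and y using one addition, two subtractions,
    and two look-ups in a table of squares."""
    s = x + y          # the single addition
    d = abs(x - y)     # first subtraction; the sign vanishes after squaring
    # Two table look-ups, the second subtraction, and a shift that divides by 4.
    # The difference equals 4*x*y, so the shift is exact.
    return (squares[s] - squares[d]) >> 2


W = 4                              # illustrative word length
SQUARES = make_square_table(W)

print(table_multiply(13, 11, SQUARES))  # 143
```

The division by four is a two-bit shift, so no multiplier is needed anywhere in the data path; the cost is moved entirely into the table.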
11.6 BIT-SERIAL ARITHMETIC
Bit-serial arithmetic is a viable alternative in digital signal processing applica-
tions to traditional, bit-parallel arithmetic. A major advantage of bit-serial over
bit-parallel arithmetic is that it significantly reduces chip area. This is done in two
ways. First, it eliminates wide buses and simplifies wire routing. Second, by using
small processing elements, the chip itself will be smaller and require shorter wir-
ing. A small chip can support higher clock frequencies and is therefore faster.
Two's-complement and binary offset representations are suitable for DSP algo-
rithms implemented with bit-serial arithmetic, since the bit-serial operations can
be done without knowing the sign of the numbers involved. Since two's-comple-
ment and binary offset representation use similar algorithms, we will discuss only
bit-serial arithmetic with two's-complement representation.
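
As an illustration of why the sign need not be known, the following Python sketch models a bit-serial adder as a single full adder with a stored carry. It assumes the words arrive least significant bit first, which is not stated in this passage, and the helper names are hypothetical. The same loop handles positive and negative two's-complement operands without inspecting their signs.

```python
def to_bits_lsb_first(value, w):
    """Two's-complement bits of value as a list, least significant bit first."""
    return [(value >> i) & 1 for i in range(w)]


def from_bits_lsb_first(bits):
    """Interpret an LSB-first bit list as a two's-complement integer."""
    unsigned = sum(b << i for i, b in enumerate(bits))
    return unsigned - (1 << len(bits)) if bits[-1] else unsigned


def bit_serial_add(a_bits, b_bits):
    """Add two equal-length LSB-first words, one bit per clock cycle.
    The carry plays the role of the flip-flop between cycles."""
    carry = 0
    out = []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)                 # sum bit of a full adder
        carry = (a & b) | (carry & (a ^ b))       # carry saved for the next bit
    return out  # the final carry is discarded (result wraps modulo 2**w)


W = 6  # illustrative word length
result = bit_serial_add(to_bits_lsb_first(-3, W), to_bits_lsb_first(5, W))
print(from_bits_lsb_first(result))  # 2
```

The sketch does not detect overflow; it only shows that one sign-agnostic processing element suffices for the whole word.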
A major issue in the design of the building blocks is to decide how data shall
be transmitted and processed. There are two basic possibilities: bit-serial or bit-
parallel transmission. Usually, this choice also governs how operations are per-
formed. When making this choice, it is easy to draw the wrong conclusion given
the following argument: "Since bit-serial arithmetic processes only one bit at each
time instance, bit-parallel arithmetic must be about $W_d$ times faster, assuming the
data word length is $W_d$." In reality, the ratio in speed will be much smaller due to
the long carry propagation paths present in parallel arithmetic. Furthermore, bit-
parallel arithmetic uses more than $W_d$ times as large a chip area as does bit-serial
arithmetic. In fact, the computational throughput per unit chip area is higher
for bit-serial than for bit-parallel arithmetic. On the whole, bit-serial arithmetic is often superior.
The issue of comparing power consumption is, however, more complicated.
Bit-parallel arithmetic suffers from energy losses in glitches that occur when the
carry propagates, but the glitches will be few if successive data are strongly corre-
lated. Driving long and wide buses consumes large amounts of power. Bit-serial
arithmetic, on the other hand, will only perform useful computations without any
glitches, but requires more clocked elements that will consume significant amounts

