Page 522 - DSP Integrated Circuits
Figure 11.41 Linear-phase FIR filter with N = 12
11.14.2 Parallel Implementation of Distributed Arithmetic
Distributed arithmetic can, of course, be implemented in parallel form, i.e., by
allocating a ROM to each of the terms in Equation (11.47). The ROMs, which are
identical, can be addressed in parallel and their values, appropriately shifted,
added using an adder tree as illustrated in Figure 11.42. The critical path runs
through a ROM and through the adder tree. Pipelining can break the critical path
into short segments to achieve very high speed.
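The parallel form can be sketched behaviorally as follows. This is a minimal sketch, not the circuit of Figure 11.42: it assumes the usual two's-complement formulation of distributed arithmetic (Equation (11.47) is not reproduced here), works on integer-scaled data, and uses the hypothetical names build_rom and da_parallel. One ROM look-up is made per bit position, the results are shifted by the bit weight, the sign-bit term is negated, and the terms are summed pairwise as an adder tree would.

```python
def build_rom(coeffs):
    """ROM contents: entry `addr` holds the sum of coeffs[i] for every set bit i of addr."""
    n = len(coeffs)
    return [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
            for addr in range(1 << n)]


def da_parallel(coeffs, xs, wd):
    """Inner product sum(coeffs[i] * xs[i]) by parallel distributed arithmetic.

    xs are integers in two's-complement range [-2**(wd-1), 2**(wd-1) - 1].
    Assumption: integer scaling instead of fractional data words.
    """
    rom = build_rom(coeffs)  # every bit position uses an identical ROM
    terms = []
    for k in range(wd):  # k = 0 is the LSB, k = wd - 1 is the sign bit
        # Address formed from bit k of every input word.
        addr = sum(((x >> k) & 1) << i for i, x in enumerate(xs))
        term = rom[addr] << k  # shift by the weight of bit k
        terms.append(-term if k == wd - 1 else term)  # sign bit has negative weight
    # Adder tree: sum the shifted ROM outputs pairwise, level by level.
    while len(terms) > 1:
        terms = [sum(terms[j:j + 2]) for j in range(0, len(terms), 2)]
    return terms[0]
```

For example, da_parallel([3, -5, 7, 2], [5, -3, -8, 7], 4) agrees with the direct inner product, -12. In hardware, each level of the tree is a natural place for a pipeline register.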
11.15 THE BASIC SHIFT-ACCUMULATOR
The shift-accumulator shall perform a shift-and-add operation in each time slot
and a subtraction in the last time slot. For typical ROM word lengths of 8 to 18
bits, a ripple-through adder or a carry-look-ahead adder is unsuitable for speed
and complexity reasons. The shift-accumulator, shown in Figure 11.43, uses
carry-save adders instead. This yields a regular hardware structure, with short
delay paths between the clocking elements. Furthermore, the shift-accumulator
can be implemented using a modular (bit-slice) design. The number of bits in the
shift-accumulator can be chosen freely.
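The arithmetic performed by the shift-accumulator can be sketched behaviorally as follows. This is a minimal sketch under stated assumptions: bits arrive least-significant bit first, the accumulator contents are halved before each new ROM word is added, and the ROM word addressed by the sign bits is subtracted in the last time slot. It does not model the carry-save adders or the bit-slice structure of Figure 11.43; exact rational arithmetic stands in for a freely chosen accumulator word length. The names build_rom and shift_accumulate are hypothetical.

```python
from fractions import Fraction


def build_rom(coeffs):
    """ROM contents: entry `addr` holds the sum of coeffs[i] for every set bit i of addr."""
    n = len(coeffs)
    return [sum(c for i, c in enumerate(coeffs) if (addr >> i) & 1)
            for addr in range(1 << n)]


def shift_accumulate(coeffs, xs, wd):
    """Bit-serial distributed arithmetic, one ROM look-up per time slot.

    xs are wd-bit two's-complement integers interpreted as fractions
    xs[i] / 2**(wd-1). Returns sum(coeffs[i] * xs[i]) / 2**(wd-1) exactly.
    """
    rom = build_rom(coeffs)
    acc = Fraction(0)  # accumulator is initially cleared
    for k in range(wd):  # time slot k handles bit k, LSB first
        addr = sum(((x >> k) & 1) << i for i, x in enumerate(xs))
        f = rom[addr]
        if k < wd - 1:
            acc = acc / 2 + f  # shift right one position, then add
        else:
            acc = acc / 2 - f  # last time slot (sign bits): subtract
    return acc
```

With coeffs = [3, -5, 7, 2], xs = [5, -3, -8, 7] and wd = 4, the accumulator passes through 0, 2, 1 and ends at -3/2, which equals the inner product -12 scaled by 2**-3.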
In the first time slot, word $F_{W_d-1}$ from the ROM shall be added to the initially
cleared accumulator. In the next time slot, $F_{W_d-2}$ shall be added to the previous

