Page 379 - DSP Integrated Circuits
364 Chapter 8 DSP Architectures
RAM and 64k x 24-bit program memory. The DSP56002 is available in a 132-pin
pin-grid array package or plastic quad flat pack (PQFP).
The Motorola 56002 has three functional units: (a) multiply-accumulate unit,
(b) address generation unit, and (c) program control unit. This architecture makes
it possible to perform two memory operations concurrently with a multiplication and an
addition. The address generation unit is responsible for calculating the different
addresses and contains its own set of registers (the address registers of the DSP).
The ALUs used for address calculations perform only simple additions, but they
also support modulo and bit-reversed addressing. The program control unit fetches
and decodes instructions and holds control registers such as the program counter,
the stack pointer, and the status register. It also contains the hardware to
support low-overhead looping and interrupt handling.
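
The two special addressing modes mentioned above can be illustrated in software. The following is a minimal sketch, not a model of the DSP56002 hardware: the function names `modulo_next` and `bit_reverse` and the example buffer address are hypothetical, and the real address generation unit performs these updates in dedicated hardware with its own register conventions.

```python
def modulo_next(addr, step, base, length):
    """Advance addr by step, wrapping inside the circular buffer
    that occupies addresses base .. base + length - 1."""
    return base + (addr - base + step) % length


def bit_reverse(index, bits):
    """Return index with its lowest `bits` bits in reversed order."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


# Walk a 4-word circular buffer that starts at (hypothetical) address 100.
addr = 100
visited = []
for _ in range(6):
    visited.append(addr)
    addr = modulo_next(addr, 1, base=100, length=4)
print(visited)  # [100, 101, 102, 103, 100, 101]

# Bit-reversed access order for an 8-point FFT (3 address bits).
print([bit_reverse(i, 3) for i in range(8)])  # [0, 4, 2, 6, 1, 5, 3, 7]
```

Modulo addressing is what lets a FIR delay line be kept as a circular buffer without copying samples, and bit-reversed addressing yields the input or output ordering of a radix-2 FFT without a separate reordering pass.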
Some DSP chips use a VLIW¹ with separate fields in the instruction control
word to control the different units. For example, the Motorola DSP56001 and
DSP56002 perform an instruction prefetch using a three-stage instruction pipeline,
a 24 × 24-bit multiply, a 56-bit addition, two data moves, and two address
pointer updates in a single instruction cycle. The DSP56002 can perform up to six
operations simultaneously, which corresponds to a peak performance of 120 MOPS
at 40 MHz. Typical power consumption is less than 0.5 W.
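
The 120 MOPS figure can be checked with a short calculation. The sketch below assumes that one instruction cycle takes two clock periods (50 ns at 40 MHz); that ratio is an assumption not stated in this paragraph, although it matches the 50 ns instruction cycle quoted for the 40 MHz DSP96001. The function name `peak_mops` is hypothetical.

```python
CLOCK_HZ = 40e6              # clock frequency from the text
CLOCKS_PER_INSTRUCTION = 2   # assumption: one instruction cycle = 2 clock periods
OPS_PER_INSTRUCTION = 6      # "up to six operations simultaneously"


def peak_mops(clock_hz, clocks_per_instruction, ops_per_instruction):
    """Peak throughput in millions of operations per second."""
    instructions_per_second = clock_hz / clocks_per_instruction
    return instructions_per_second * ops_per_instruction / 1e6


print(peak_mops(CLOCK_HZ, CLOCKS_PER_INSTRUCTION, OPS_PER_INSTRUCTION))  # 120.0
```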
Typical performance figures for the DSP56002: an N-tap FIR filter requires N+7
instruction cycles, while N second-order sections in direct form II require 5N+1
instruction cycles. A 91-tap FIR filter can be executed in 4.9 µs and a sixth-order
filter in 0.8 µs at 40 MHz. A 256-point and a 1024-point FFT, without bit-reversal,
take 0.78 and 3.89 ms, respectively.
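
The quoted filter timings follow from the cycle counts if each counted cycle lasts 50 ns, that is, one instruction cycle of two clock periods at 40 MHz. That interpretation is an assumption inferred from the numbers themselves. The sketch below, with hypothetical helper names, reproduces both figures; a sixth-order filter is taken as three second-order sections.

```python
CYCLE_NS = 50  # assumption: one counted cycle = 50 ns at 40 MHz


def fir_cycles(taps):
    """Cycle count quoted for an N-tap FIR filter: N + 7."""
    return taps + 7


def biquad_cycles(sections):
    """Cycle count quoted for N second-order sections (direct form II): 5N + 1."""
    return 5 * sections + 1


def exec_time_us(cycles, cycle_ns=CYCLE_NS):
    """Execution time in microseconds for a given cycle count."""
    return cycles * cycle_ns / 1000


print(fir_cycles(91), exec_time_us(fir_cycles(91)))      # 98 4.9
print(biquad_cycles(3), exec_time_us(biquad_cycles(3)))  # 16 0.8
```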
The DSP56002 is an enhanced, software-compatible version of the DSP56001.
The current versions have clock frequencies of 40 to 80 MHz. A low-power version,
the DSP56L002, which operates from a 3.3 V power supply, is also available.
The main new features, compared to the DSP56001, are an on-chip emulator, a
phase-locked loop (PLL) clock oscillator, improved external memory interface, and
some new instructions. The new instructions include INC (increment) and DEC
(decrement), double-precision multiply, and debug instructions. The latter make
it possible to examine registers, memory, and on-chip peripherals during program
development.
8.3.7 Motorola DSP96001™ and DSP96002™
The DSP96001 is a floating-point version of the DSP56001 aimed at multimedia
applications. It can be used as a stand-alone device or in multiprocessor
configurations. The clock frequency of the current version is 40 MHz and the
instruction cycle time is 50 ns, with a peak performance of 40 MFLOPS. The Motorola DSP96001 has
three functional units operating in parallel: (a) IEEE 32-bit floating-point multiply-
and-accumulate unit, (b) address generation unit, and (c) program control unit. In
addition, two channels of DMA, six on-chip memories, and various external inter-
¹ VLIW (Very Long Instruction Word): A parallel architecture that uses multiple, independent
functional units and packages multiple operations into one very long instruction. The
instruction has a field for each processing element (PE).