Page 374 - DSP Integrated Circuits
8.3 STANDARD DSP ARCHITECTURES
Current computer architectures are commonly classified as RISCs (reduced
instruction-set computers) and CISCs (complex instruction-set computers). A
CISC has a large number of powerful instructions, while a RISC has fewer,
simpler instructions and is characterized by performing a register-to-register
move in one machine cycle. RISCs are today beginning to replace CISCs because
they achieve higher throughput through highly pipelined execution of simple
instructions and efficient compilers. The time and cost of developing a RISC
are lower than for a CISC. Hence, the time between two RISC generations can be
made shorter, and the latest VLSI technology can be used to achieve high
performance.
Standard DSP processors have many RISC-like features, but they are
special-purpose processors whose architecture is designed to operate in a
computationally demanding, real-time environment. RISCs, on the other hand, are
general-purpose processors, even though they are beginning to find their way
into some embedded applications. A standard DSP processor executes several
operations in parallel, while a RISC uses heavily pipelined functional units
that can initiate and complete a simple instruction in one or two clock cycles.
Note, however, that the latency of a RISC instruction (for example, a
floating-point addition) may be much longer, and that this latency sets a lower
bound on the iteration period of a recursive algorithm. In practice, it may be
difficult to make efficient use of long pipelines.
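The effect of latency on a recursive computation can be illustrated with a rough cycle count. The sketch below is not taken from the text: it assumes an idealized, fully pipelined unit that accepts one new operation per clock cycle and delivers each result `latency` cycles after issue. The function names are hypothetical.

```python
def cycles_independent(n_ops: int, latency: int) -> int:
    """Cycles needed for n_ops independent operations.

    Assumption: a new operation is issued every cycle, so the pipeline
    stays full and only the first result pays the full latency.
    """
    if n_ops == 0:
        return 0
    return latency + (n_ops - 1)


def cycles_recursive(n_ops: int, latency: int) -> int:
    """Cycles needed for a chain of n_ops dependent operations.

    Assumption: each operation needs the previous result (as in a
    recursion such as y[n] = y[n-1] + x[n]), so it cannot be issued
    until the previous operation has left the pipeline.
    """
    return n_ops * latency


if __name__ == "__main__":
    # Hypothetical floating-point adder with a latency of 4 cycles.
    print(cycles_independent(100, 4))  # 103
    print(cycles_recursive(100, 4))    # 400
```

Under these assumptions, 100 independent additions finish in 103 cycles, whereas 100 chained additions need 400 cycles: the iteration period of the recursion equals the latency, no matter how deep the pipeline is.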
Standard DSP processors are generally characterized by the following
architectural features:
1. A fast on-chip multiplier that can perform multiply-and-accumulate
type operations in one instruction cycle. An instruction cycle is generally
one or two clock cycles long. Both fixed-point and floating-point
arithmetic DSPs are available.
2. Several functional units that operate in parallel, performing
arithmetic operations, memory accesses, and address calculations. The
functional units typically include the main arithmetic logic unit (ALU)
together with two or more address generation units. The functional units
usually have their own sets of registers, and most instructions execute
in a single instruction cycle.
3. Several large on-chip memory units (typically two to three) used to store
instructions, data, or look-up tables. Each memory unit can be accessed
once every instruction cycle. Large numbers of temporary registers are
used to store values used for long periods of time. Many modern DSPs
have some form of instruction cache, but they do not have data caches
and do not support virtual memory.
4. Several on-chip system buses to increase memory transfer rate and avoid
addressing conflicts.
5. Support for special addressing modes, especially modulo and bit-reversed
addressing needed in the FFT. Modulo addressing allows for fast circular
buffers that reduce the overhead in recursive algorithms. Dedicated
hardware for address calculations.
6. Support for low-overhead looping and fast interrupt handling, especially
for interrupts arising from arithmetic or I/O operations. On-chip serial
ports.
7. Standby power-saving modes during which only the peripherals and the
memory are active.
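The multiply-and-accumulate operation (feature 1) and the modulo and bit-reversed addressing modes (feature 5) can be mimicked in software to show what the hardware does at no extra cost. The following is only an illustrative sketch, not code for any particular DSP: the function names `fir_circular` and `bit_reversed` are hypothetical, and the explicit `%` and shift operations stand in for address arithmetic that a DSP's address generation unit would perform in parallel with the multiplier.

```python
def fir_circular(samples, coeffs):
    """FIR filter whose delay line is a circular buffer.

    The buffer index is updated modulo the buffer length, which is what
    modulo addressing provides in hardware. The inner loop performs one
    multiply-and-accumulate (MAC) per filter tap.
    """
    n_taps = len(coeffs)
    delay = [0.0] * n_taps      # circular delay line
    head = 0                    # position of the newest sample
    out = []
    for x in samples:
        head = (head + 1) % n_taps      # modulo address update
        delay[head] = x                 # overwrite the oldest sample
        acc = 0.0
        idx = head
        for c in coeffs:
            acc += c * delay[idx]       # MAC operation
            idx = (idx - 1) % n_taps    # step back to the next older sample
        out.append(acc)
    return out


def bit_reversed(index, n_bits):
    """Return index with its n_bits-bit binary representation reversed."""
    rev = 0
    for _ in range(n_bits):
        rev = (rev << 1) | (index & 1)  # move the lowest bit of index into rev
        index >>= 1
    return rev


if __name__ == "__main__":
    # Impulse response of a 3-tap filter reproduces its coefficients.
    print(fir_circular([1, 0, 0, 0, 0], [1, 2, 3]))
    # Access order for the data of an 8-point radix-2 FFT.
    print([bit_reversed(i, 3) for i in range(8)])
```

Because the buffer index wraps around, no samples are moved when a new input arrives; only one memory location is overwritten per sample. In the bit-reversal example, the eight indices come out in the order 0, 4, 2, 6, 1, 5, 3, 7, which is the order in which a radix-2 FFT reads or writes its data.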