Page 374 - DSP Integrated Circuits

        8.3 STANDARD DSP ARCHITECTURES

        Current computer architectures are commonly classified as RISCs (reduced
        instruction-set computers) and CISCs (complex instruction-set computers). CISCs
        have a large number of powerful instructions, while a RISC processor has fewer
        instructions and is defined as performing a register-to-register move in one
        machine cycle. RISC computers are today beginning to replace CISCs, because
        they achieve higher throughput by using highly pipelined execution of simple
        instructions and efficient compilers. The time and cost for developing RISCs are
        lower than for CISCs. Hence, the time between two RISC generations can be made
        shorter and the latest VLSI technology can be used to achieve high performance.
            Standard DSP processors have many RISC-like features, but they are special-
        purpose processors whose architecture is designed to operate in a computationally
        demanding, real-time environment. RISCs, on the other hand, are general-pur-
        pose processors, even though they are beginning to find their way into some
        embedded applications. A standard DSP processor executes several operations in
        parallel while a RISC uses heavily pipelined functional units that can initiate and
        complete a simple instruction in one or two clock cycles. Note, however, that the
        latency of a RISC instruction (for example, a floating-point addition) may be much
        longer and that the minimum iteration period is bounded by this latency. In practice
        it may be difficult to make efficient use of long pipelines.
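
            The effect of latency on a recursive computation can be illustrated with
        a minimal sketch. It assumes an idealized, fully pipelined unit that can start
        one new operation per clock cycle; the function names and the latency of four
        cycles are hypothetical and are not taken from any particular processor.

```python
def cycles_independent(n_ops: int, latency: int) -> int:
    """Cycles to finish n_ops independent operations.

    Assumes an idealized pipeline that accepts one new operation per cycle,
    so the pipeline fills once and then delivers one result per cycle.
    """
    if n_ops == 0:
        return 0
    return latency + (n_ops - 1)


def cycles_dependent(n_ops: int, latency: int) -> int:
    """Cycles to finish n_ops operations that each need the previous result.

    This models a recursion: the next operation cannot start until the
    previous one has left the pipeline, so the full latency is paid each time.
    """
    return n_ops * latency


if __name__ == "__main__":
    LATENCY = 4  # hypothetical floating-point addition latency in cycles
    print(cycles_independent(100, LATENCY))  # 103
    print(cycles_dependent(100, LATENCY))    # 400
```

            Under these assumptions the independent operations approach one result per
        cycle, while the dependent chain delivers only one result per latency period,
        which is why a long pipeline is hard to use efficiently in a recursive loop.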
            Standard DSP processors are generally characterized by the following archi-
        tectural features:

             1. A fast on-chip multiplier that can perform multiply-and-accumulate
                type operations in one instruction cycle. An instruction cycle is generally
                one or two clock cycles long. Both fixed-point and floating-point
                arithmetic DSPs are available.
             2. Several functional units that perform several parallel operations,
               including memory accesses and address calculations. The functional units
               typically include the main arithmetic logic unit (ALU) together with two
               or more address generation units. The functional units usually have their
               own set of registers, and most instructions execute in a single
               instruction cycle.
             3. Several large on-chip memory units (typically two to three) used to store
               instructions, data, or look-up tables. Each memory unit can be accessed
               once every instruction cycle. A large number of temporary registers
               hold values that are needed for long periods of time. Many modern DSPs
               have some form of instruction cache, but they do not have data caches
               and do not support virtual memory.
             4. Several on-chip system buses to increase memory transfer rate and avoid
               addressing conflicts.
             5. Support for special addressing modes, especially modulo addressing and
               the bit-reversed addressing needed in the FFT. Modulo addressing allows
               for fast circular buffers that reduce the overhead in recursive
               algorithms. Dedicated hardware performs the address calculations.
             6. Support for low-overhead looping and fast interrupt handling, especially
               that arising from arithmetic or I/O operations. On-chip serial ports.
             7. Standby power-saving modes during which only the peripherals and the
               memory are active.
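
            Two of the features listed above, the multiply-and-accumulate operation
        and the special addressing modes, can be illustrated in software. The sketch
        below is only a behavioral model: a DSP performs the address updates in
        dedicated hardware in parallel with the arithmetic, whereas here they are
        ordinary statements. The names CircularFir and bit_reverse are hypothetical
        and chosen for this illustration.

```python
def bit_reverse(index: int, bits: int) -> int:
    """Return index with its lowest `bits` bits in reversed order.

    This is the address permutation used to reorder FFT data.
    """
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


class CircularFir:
    """FIR filter whose delay line is a circular buffer (modulo addressing)."""

    def __init__(self, coeffs):
        self.coeffs = list(coeffs)
        self.buf = [0.0] * len(self.coeffs)
        self.head = 0  # address of the most recent sample

    def step(self, x: float) -> float:
        n = len(self.buf)
        # Modulo address update: the oldest sample is overwritten in place,
        # so no data has to be shifted through the delay line.
        self.head = (self.head + 1) % n
        self.buf[self.head] = x

        acc = 0.0
        addr = self.head
        for c in self.coeffs:
            acc += c * self.buf[addr]  # one multiply-and-accumulate per tap
            addr = (addr - 1) % n      # step back to the next older sample
        return acc


if __name__ == "__main__":
    fir = CircularFir([1.0, 2.0, 3.0])
    # Impulse response reproduces the coefficients.
    print([fir.step(x) for x in (1.0, 0.0, 0.0, 0.0)])  # [1.0, 2.0, 3.0, 0.0]
    # Bit-reversed order for an 8-point FFT (3 address bits).
    print([bit_reverse(i, 3) for i in range(8)])  # [0, 4, 2, 6, 1, 5, 3, 7]
```

            In this model each filter tap costs one multiplication, one addition, and
        one address update. On a standard DSP these would typically be issued
        together, which is the motivation for the parallel functional units and the
        dedicated address generation hardware described in the list.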