Page 392 - DSP Integrated Circuits
P. 392

8.8 Wave Front Arrays                                                377




































                                Figure 8.23 Wave front array




            Systolic and wave front array implementations of FIR and IIR niters achieve
        a maximum throughput rate of one output per multiply-accumulate cycle. For
        example, lattice wave digital niters with branches implemented using Richards
        structures can be implemented efficiently using systolic and wave front arrays.
        They use, at most, twice as many PEs as the filter order.


        8.8.1 Datawave™
        ITT Intermetal has developed a single-chip multiple-instruction multiple-data
        (MIMD) processor, Datawave [19], for video applications. Figure 8.24 shows the
        architecture. The PEs are organized in a 4 x 4 rectangular array and the ICN pro-
        vides nearest neighbor communication. Several chips can be combined to synthe-
        size larger arrays. The chip has a data word length of only 12 bits, which is
        sufficient for most video applications. Communication between PEs, as well as
        with the outside, takes place via two 12-bit wide buses and requires only one clock
        cycle, except for interchip communication, which is somewhat slower.
            Communication between PEs is data-driven. Operations in the algorithm are
        therefore statically scheduled at compile time, although only the order of opera-
        tions is scheduled, not the timing. FIFOs have been inserted between the PEs to
        improve performance by averaging out variations in work load between the PEs.
        Self-timed handshake protocols are used to synchronize the asynchronous data
   387   388   389   390   391   392   393   394   395   396   397