Page 387 - DSP Integrated Circuits

372                                              Chapter 8 DSP Architectures


        8.6.1 Interconnection Topologies
        To exploit parallelism efficiently, a parallel or distributed system must be designed
        to minimize communication overhead between processors. A given communication
        strategy might support one application well but be inefficient for others. A complete
        interconnection between all processors (for example, using the crossbar
        shown in Figure 8.16) might be cost-prohibitive, while a shared single-bus
        interconnection, as shown in Figure 8.17, might be inefficient because only one
        transfer can use the bus at a time. Hence, most applications
        call for a solution whose cost and performance lie somewhere between the two
        extremes. The interconnection network must be efficient, reliable, and cost-effective.
        Topologies that can accommodate an arbitrary number of nodes are called
        scalable architectures.
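
        The two extremes can be compared with a rough count. The sketch below is
        illustrative only: it assumes an N x N crossbar with one crosspoint switch per
        input/output pair and a single bus with one tap per processor that carries one
        transfer at a time. The function names and the dictionary keys are hypothetical
        and do not come from the text.

```python
def crossbar_cost(n):
    """Assumed model: n x n crossbar, one crosspoint per input/output pair."""
    return {"connections": n * n, "max_transfers": n}


def single_bus_cost(n):
    """Assumed model: one bus tap per processor, one transfer at a time."""
    return {"connections": n, "max_transfers": 1}


if __name__ == "__main__":
    for n in (4, 16, 64):
        print(n, crossbar_cost(n), single_bus_cost(n))
```

        Under these assumptions the crossbar hardware grows quadratically with the
        number of processors, while the bus grows linearly but never supports more than
        one transfer at a time.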
            Many different interconnection networks have been proposed [1, 3, 9]. Some
        common examples are shown in Figures 8.18 through 8.21. Mesh-connected
        processor arrays have simple and regular interconnections and are therefore
        preferred architectures for special-purpose VLSI designs [7].
            Special single-chip, 32-bit processors called Transputers™ have been
        introduced by Inmos for use in such networks [11]. The Transputer, which is aimed at
        multicomputer applications, has four high-speed communication ports. Hence, it is
        well suited for square mesh and Boolean 4-cube topologies, in which each node
        connects to at most four neighbors. However, the Transputer is a general RISC
        processor and is therefore not optimized for DSP applications.
        Some of the more recent standard DSPs are provided with I/O ports to
        support multicomputer applications.
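
        The link between four ports and these two topologies can be seen by listing
        the neighbors of a node. The following is a minimal sketch, assuming nodes of a
        Boolean cube are numbered by their binary address and mesh nodes by (row, column)
        without wraparound; the function names are hypothetical.

```python
def hypercube_neighbors(node, dim):
    """Neighbors of `node` in a Boolean dim-cube: flip each address bit once."""
    return [node ^ (1 << k) for k in range(dim)]


def mesh_neighbors(row, col, rows, cols):
    """Neighbors of (row, col) in a rows x cols mesh without wraparound."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]


if __name__ == "__main__":
    print(hypercube_neighbors(0b0101, 4))  # [4, 7, 1, 13]
    print(mesh_neighbors(1, 1, 4, 4))      # [(0, 1), (2, 1), (1, 0), (1, 2)]
```

        Every node of a 4-cube and every interior node of a square mesh has exactly
        four neighbors, which matches the four communication ports.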
            A variation of the Boolean cube is the cube-connected cycles (CCC) network, in
        which each node of the Boolean cube is replaced by a cycle of PE-memory pairs.
        Each PE-memory pair then needs only three ports: two for its neighbors in the
        cycle and one for a link along a cube dimension.
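
        The three-port property can be illustrated with a short sketch. It assumes the
        usual CCC labeling, in which a PE is identified by a pair (position, address):
        `address` is the corner of the underlying Boolean cube and `position` selects
        both the place in the cycle and the cube dimension served by that PE. The
        function name and the labeling are assumptions, not taken from the text.

```python
def ccc_neighbors(position, address, dim):
    """Neighbors of PE (position, address) in a cube-connected cycles network.

    Assumed labeling: `address` is a dim-bit cube corner and `position`
    (0 .. dim-1) is the PE's place in the cycle that replaces that corner.
    """
    previous_in_cycle = ((position - 1) % dim, address)
    next_in_cycle = ((position + 1) % dim, address)
    across_cube = (position, address ^ (1 << position))
    return [previous_in_cycle, next_in_cycle, across_cube]


if __name__ == "__main__":
    print(ccc_neighbors(0, 0b000, 3))  # [(2, 0), (1, 0), (0, 1)]
```

        For a cube of dimension three or more, the three neighbors are distinct, so each
        PE uses exactly three ports regardless of the cube dimension.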
            Multistage interconnection networks (MICN) and multiple-bus topologies are
        also commonly used [2, 7]. The omega network, shown in Figure 8.20, consists of
        multiple stages of switches connected in a shuffle-exchange pattern. The source
        processor generates a tag that is the binary representation of the destination
        address. The connection switches in the ith stage

            Figure 8.16 Crossbar architecture
            Figure 8.17 A single-bus structure
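
        The destination tag used by the omega network can be illustrated with a small
        simulation. The sketch below is not taken from the text. It assumes a commonly
        used convention: each stage applies a perfect shuffle to the line number, and
        the switch then selects its upper output for a tag bit of 0 and its lower output
        for a tag bit of 1, reading the tag from the most significant bit first. The
        function name and parameters are hypothetical.

```python
def omega_route(source, dest, n_bits):
    """Trace a message through an omega network with 2**n_bits inputs.

    Assumed convention: the switch in stage i examines bit i of the
    destination tag (most significant bit first); 0 selects the upper
    output and 1 the lower output. Returns (stage, tag_bit, line) tuples.
    """
    mask = (1 << n_bits) - 1
    line = source
    path = []
    for stage in range(n_bits):
        # Perfect shuffle: rotate the line number left by one bit.
        line = ((line << 1) | (line >> (n_bits - 1))) & mask
        # Exchange switch: the tag bit replaces the least significant bit.
        tag_bit = (dest >> (n_bits - 1 - stage)) & 1
        line = (line & ~1) | tag_bit
        path.append((stage, tag_bit, line))
    return path


if __name__ == "__main__":
    print(omega_route(5, 2, 3))  # [(0, 0, 2), (1, 1, 5), (2, 0, 2)]
```

        Under this convention the message arrives at the line whose number equals the
        tag after the last stage, whatever the source was.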