Page 363 - DSP Integrated Circuits
348 Chapter 7 DSP System Design
7.12 DCT PROCESSOR, CONT.
A 2-D DCT system should be able to process an HDTV image of 1920 x 1080 pixels
in real time in order to satisfy the most stringent requirement in Table 3.3. The
image frequency is 60 Hz. The image is partitioned into subframes of 16 x 16
pixels, which are processed in the 2-D DCT system. Hence, the requirement is
    (1920 x 1080 x 60) / (16 x 16) = 486,000 2-D DCTs/s
The line memory stores 16 complete lines of image and supports the DCT chip
with data, as illustrated in Figure 7.93.
Figure 7.93 Environment for the 2-D DCT (diagram labels: 1920 pixels, 16 lines,
line memory, 2-D DCT)
The processing element in the 2-D DCT system is a 1-D DCT bit-serial PE. The PE
can compute one 16-point DCT in W_d = 12 clock cycles = 1 time unit. The feasible
clock frequency is at least 220 MHz. As discussed in Chapter 3, the computation
of a 2-D DCT requires 32 time units. Two additional time units are lost since the
pipelined PEs must be emptied between the row and column computations. Thus, one
such PE can compute
    (220 x 10^6) / (12 x (32 + 2)) = approximately 539,000 2-D DCTs/s
Obviously, this is enough, but we still choose to use two 2-D DCT PEs. The reason
is the reduced speed requirement on both the PEs and the RAMs. Further, the use
of two PEs facilitates a more regular architecture. The pixels have a resolution
of 8 to 9 bits, while internal computations in the 2-D DCT use 12 bits, including
the guard bit.
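The throughput figures in this passage follow from a few lines of arithmetic. The Python sketch below recomputes them from the numbers quoted in the text. The constant and function names are illustrative and do not come from the book. As in the text, blocks are counted by pixel area, even though 1080 is not a multiple of 16.

```python
# Figures quoted in the text
WIDTH, HEIGHT = 1920, 1080      # HDTV image size in pixels
FRAME_RATE_HZ = 60              # image frequency
BLOCK = 16                      # subframes are 16 x 16 pixels
CLOCK_HZ = 220e6                # feasible PE clock frequency
CYCLES_PER_TIME_UNIT = 12       # W_d: one 16-point 1-D DCT per time unit
TIME_UNITS_PER_2D_DCT = 32 + 2  # 32 for rows and columns, 2 lost emptying the pipeline


def required_dct_rate():
    """2-D DCTs per second needed to keep up with the image stream."""
    return WIDTH * HEIGHT * FRAME_RATE_HZ / (BLOCK * BLOCK)


def pe_dct_rate(clock_hz=CLOCK_HZ):
    """2-D DCTs per second that one PE delivers at the given clock."""
    return clock_hz / (CYCLES_PER_TIME_UNIT * TIME_UNITS_PER_2D_DCT)


def min_clock_hz(num_pes):
    """Lowest PE clock that meets the requirement when the load is split evenly."""
    cycles_per_dct = CYCLES_PER_TIME_UNIT * TIME_UNITS_PER_2D_DCT
    return required_dct_rate() * cycles_per_dct / num_pes


if __name__ == "__main__":
    print(f"required : {required_dct_rate():,.0f} 2-D DCTs/s")
    print(f"one PE   : {pe_dct_rate():,.0f} 2-D DCTs/s at 220 MHz")
    for n in (1, 2):
        print(f"{n} PE(s)  : clock >= {min_clock_hz(n) / 1e6:.1f} MHz")
```

With these figures a single PE would need a clock of about 198 MHz, while two PEs need roughly 99 MHz each. This matches the stated motivation for using two PEs.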
For testing purposes we will use standard FIFOs, which are available in
organizations of 2^n x 9-bit words (Am7204-25 with a cycle time of 35 ns), as line
memories. Allowing 15 ns for I/O, the FIFOs can be clocked at 20 MHz. The FIFOs will
directly feed data into the PEs. The required number of FIFOs is
We choose to use eight FIFOs, each of 4096 words. The total amount of memory is
of which we use only
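The displayed expressions for the FIFO count and the memory totals are missing from this copy of the text. The Python sketch below is therefore only one plausible way to size the line memory from the quoted figures, not the book's own calculation. It assumes that the FIFOs must hold 16 lines of 1920 pixels, that every stored pixel occupies a full 9-bit word, and that the input pixel rate is spread evenly over the FIFOs. All names are illustrative.

```python
import math

# Figures quoted in the text
FIFO_CYCLE_NS = 35       # Am7204-25 cycle time
IO_ALLOWANCE_NS = 15     # extra time allowed for I/O
FIFO_WORDS = 4096        # words per FIFO
FIFO_WORD_BITS = 9       # bits per word
NUM_FIFOS = 8            # number of FIFOs chosen in the text
LINE_PIXELS = 1920
LINES_STORED = 16
PIXEL_RATE_HZ = 1920 * 1080 * 60


def fifo_clock_hz():
    """Usable FIFO clock once the I/O allowance is added to the cycle time."""
    return 1e9 / (FIFO_CYCLE_NS + IO_ALLOWANCE_NS)


def words_needed():
    """Words required to hold 16 complete image lines (assumption: one pixel per word)."""
    return LINE_PIXELS * LINES_STORED


def fifos_for_capacity():
    """FIFOs needed if storage capacity is the limiting factor."""
    return math.ceil(words_needed() / FIFO_WORDS)


def fifos_for_bandwidth():
    """FIFOs needed if the input pixel rate is the limiting factor."""
    return math.ceil(PIXEL_RATE_HZ / fifo_clock_hz())


def total_bits():
    """Total capacity of the chosen FIFOs."""
    return NUM_FIFOS * FIFO_WORDS * FIFO_WORD_BITS


def used_bits():
    """Bits occupied by 16 lines, assuming full 9-bit words."""
    return words_needed() * FIFO_WORD_BITS


if __name__ == "__main__":
    print(f"FIFO clock       : {fifo_clock_hz() / 1e6:.0f} MHz")
    print(f"FIFOs (capacity) : {fifos_for_capacity()}")
    print(f"FIFOs (bandwidth): {fifos_for_bandwidth()}")
    print(f"total / used bits: {total_bits():,} / {used_bits():,}")
```

Under these assumptions capacity is the binding constraint. Eight FIFOs are needed to hold 16 lines, which agrees with the number chosen in the text.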
The use of FIFOs simplifies communication with the surrounding circuitry.
However, it is not necessary to use FIFOs. RAMs can also be used. The DCT PEs
receive data from the FIFOs and produce intermediate data during the first 16
time units. Two time units are lost in the pipeline, then in the last 16 time units
the output is produced and directed to the output port. Consequently, by interleav-