Page 122 - DSP Integrated Circuits
3.20 DCT Processor - Case Study 2
3.20.1 Specification
To meet these stringent requirements, the processor should be able to compute
486,000 two-dimensional (16 x 16) DCTs/s. Typically, input data word lengths are
in the range of 8 to 9 bits and the required internal data word length is estimated
to be 10 to 12 bits. The coefficient word length is estimated to be 8 to 10 bits.
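The figures in the specification can be turned into a quick feasibility check. The following is a minimal sketch, not part of the original design: it computes the input sample rate implied by 486,000 blocks of 16 × 16 values per second, and estimates the rounding error when cosine coefficients are stored with a given word length. The fixed-point format (one sign bit, the rest fractional bits, with saturation) and the DCT-II form of the cosine basis are assumptions made for illustration, as are all function names.

```python
import math

DCTS_PER_SECOND = 486_000  # from the specification
N = 16                     # block size (16 x 16)

def samples_per_second(dcts_per_second=DCTS_PER_SECOND, n=N):
    """Number of input samples the processor must accept per second."""
    return dcts_per_second * n * n

def quantize(value, word_length):
    """Round a value in [-1, 1] to a signed fixed-point word.

    Assumed format: 1 sign bit and (word_length - 1) fractional bits,
    saturating at the largest representable positive value.
    """
    scale = 1 << (word_length - 1)
    q = round(value * scale)
    q = max(-scale, min(scale - 1, q))
    return q / scale

def max_coefficient_error(word_length, n=N):
    """Largest quantization error over the assumed cosine basis values."""
    worst = 0.0
    for k in range(n):
        for i in range(n):
            c = math.cos((2 * i + 1) * k * math.pi / (2 * n))
            worst = max(worst, abs(c - quantize(c, word_length)))
    return worst

if __name__ == "__main__":
    print("samples/s:", samples_per_second())
    for bits in (8, 9, 10):
        print(bits, "bits: max coefficient error", max_coefficient_error(bits))
```

Under these assumptions the worst case is the coefficient equal to 1, which saturates to the largest positive word; a different scaling of the coefficients would change that bound.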
3.20.2 System Design Phase
The 2-D DCT is implemented by successively performing 1-D DCTs, as discussed in
Example 3.9, where the computation was split into two phases. The algorithm is
illustrated in Figure 3.33. In the first phase, the input array with 16 rows is
read row-wise into the processing element, which computes a 1-D DCT, and the
results are stored in the RAM. In the second phase, the columns in the
intermediate array are successively computed by the processing element. This
phase also involves 16 1-D DCT computations. Theoretically, a transposition of
the intermediate array is required, but in practice it can be realized by
appropriately writing and reading the intermediate values. See Problem 3.32. The
algorithm is described by the pseudocode shown in Box 3.7.
Figure 3.33 Realization of the 2-D DCT
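The remark that the transposition can be realized by appropriately writing and reading the intermediate values can be illustrated with address generation. The sketch below shows one possible scheme, not necessarily the one intended by Problem 3.32: phase 1 writes each row result to a column-major address, so that phase 2 can read each column from consecutive RAM locations. The RAM is modeled as a flat Python list, and all names are hypothetical.

```python
N = 16  # block size from the specification

def write_address(row, k, n=N):
    """Phase 1: address for output k of the 1-D DCT of row `row`.

    Writing in column-major order places each column of the
    intermediate array in n consecutive locations.
    """
    return k * n + row

def read_address(col, i, n=N):
    """Phase 2: address of element i of column `col`."""
    return col * n + i

def store_phase1(intermediate, n=N):
    """Write the row-wise phase-1 results into a flat RAM model."""
    ram = [None] * (n * n)
    for row in range(n):
        for k in range(n):
            ram[write_address(row, k, n)] = intermediate[row][k]
    return ram

def load_column(ram, col, n=N):
    """Read one column of the intermediate array for phase 2."""
    return [ram[read_address(col, i, n)] for i in range(n)]
```

No data is moved between the two phases; only the order in which addresses are generated differs, which is why an explicit transposition step is not needed.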
Program Two_D_DCT;
var
begin
  { Read input data, row-wise }
  for n := 1 to 16 do
    DCT_of_Row(n);
  Transpose_Intermediate_Array;
  for n := 1 to 16 do
    DCT_of_Row(n);
  { Write output, row-wise }
end.
Box 3.7. Pseudo-code for the two-dimensional DCT
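The structure of Box 3.7 can be tried out in software. The following is a minimal floating-point sketch that mirrors the two phases, assuming an orthonormal DCT-II for the 1-D transform (the text does not fix the normalization here, so the scaling is an assumption). The function names are hypothetical, and the final transpose is added only so that the result is indexed as [vertical frequency][horizontal frequency].

```python
import math

def dct_1d(x):
    """Direct 1-D DCT (assumed orthonormal DCT-II) of a sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        acc = 0.0
        for i in range(n):
            acc += x[i] * math.cos((2 * i + 1) * k * math.pi / (2 * n))
        out.append(scale * acc)
    return out

def transpose(a):
    """Return the transpose of a list-of-lists matrix."""
    return [list(col) for col in zip(*a)]

def dct_2d(block):
    """2-D DCT computed as in Box 3.7: rows, transpose, rows again."""
    intermediate = [dct_1d(row) for row in block]        # phase 1
    intermediate = transpose(intermediate)               # columns become rows
    result = [dct_1d(row) for row in intermediate]       # phase 2
    return transpose(result)                             # natural orientation

if __name__ == "__main__":
    flat = [[1.0] * 16 for _ in range(16)]
    coeffs = dct_2d(flat)
    print(round(coeffs[0][0], 6))  # DC term of a constant block
```

For a constant block, only the DC coefficient is nonzero, which gives a simple sanity check of the two-phase structure.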
The required number of arithmetic operations per second in this algorithm is
determined as follows: A direct computation of the 1-D DCT involves 16
multiplications and 15 additions. If the symmetry and antisymmetry in the basis
vectors were exploited, the number of multiplications would be reduced to 8, but
the number of