Page 512 - DSP Integrated Circuits

11.10 Bit-Serial Squarers














            In the next step, f_2 = f(x - x_1) is decomposed in the same manner into the
         square of the most significant bit of x - x_1 with a rest term and a remaining square f_3.
         The scheme is repeated as long as there are bits left to process in the remaining
         square, i.e., until the square f_n is reached.
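
            One decomposition step can be written out as a worked equation. This is a
         sketch under an assumption not restated on this page: x is taken to be an unsigned
         fraction x = sum of x_i 2^{-i} for i = 1, ..., n, with x_1 as the most significant bit.
         The symbol r_j for the bits below x_j is introduced here for illustration only. The
         last step uses x_j^2 = x_j for a binary digit.

```latex
\begin{align*}
f_j &= \Bigl(\sum_{i=j}^{n} x_i 2^{-i}\Bigr)^{2}
     = \bigl(x_j 2^{-j} + r_j\bigr)^{2},
     \qquad r_j = \sum_{i=j+1}^{n} x_i 2^{-i} \\
    &= x_j^{2}\, 2^{-2j} + 2\, x_j 2^{-j}\, r_j + r_j^{2} \\
    &= x_j \bigl(2^{-2j} + 2^{1-j}\, r_j\bigr) + f_{j+1}
\end{align*}
```

            Under this reading, the first term is the square of bit x_j together with its rest
         term, and f_{j+1} = r_j^2 is the remaining square that the next step decomposes.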
            Examining this scheme, we find that to input a bit-serial word x with
         the least significant bit first, we have to reverse the order of the iterations in the
         preceding scheme to obtain a suitable algorithm.
            The iterative algorithm then can be written as



        where










            In each iteration j we accumulate the previous term f_{j+1} and input the next bit
        x_j. If x_j = 1, then we add the square of the bit weight and the weights of the bits
        that have arrived prior to bit x_j shifted left 1 - j positions (i.e., scaled by 2^{1-j}).
        Then we store bit x_j for the next iteration. Examination of the bit weights accumulated
        in each iteration reveals that the sum converges toward the correct square by at least
        one bit in each step, going from the least significant bit in the result toward more
        significant bits.
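
            The iteration described above can be sketched in a few lines of Python. This is an
        illustrative model, not the book's formulation: it assumes an unsigned fraction
        x = sum of x_i 2^{-i} (x_1 most significant, x_n arriving first), and the function name
        square_lsb_first and the variable rest are hypothetical. Exact rational arithmetic is
        used so that the partial sums can be inspected without rounding.

```python
from fractions import Fraction


def square_lsb_first(bits):
    """Return the partial sums f_n, f_{n-1}, ..., f_1 for bits = [x_1, ..., x_n].

    Assumption: x is an unsigned fraction, x = sum(x_i * 2**-i), x_1 is the MSB.
    """
    n = len(bits)
    f = Fraction(0)      # accumulated sum; starts at zero before the first bit
    rest = Fraction(0)   # weights of the bits that have already arrived (i > j)
    partial = []
    for j in range(n, 0, -1):          # least significant bit x_n first
        if bits[j - 1]:
            # square of the bit weight + earlier bits scaled by 2**(1 - j)
            f += Fraction(1, 2 ** (2 * j)) + rest * Fraction(2) ** (1 - j)
            rest += Fraction(1, 2 ** j)  # store bit x_j for later iterations
        partial.append(f)
    return partial


# x = 0.1011 (binary) = 11/16, so the final term should equal (11/16)**2
print(square_lsb_first([1, 0, 1, 1])[-1])  # 121/256
```

            The last element of the returned list is the full square; the earlier elements are
        the intermediate sums after each incoming bit.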
            An implementation of the preceding algorithm is shown in Figure 11.35. It
        uses a shift-accumulator to shift the accumulated sum to the right after each
        iteration. Thus, left-shifting of the stored x_i's is avoided and the addition of the
        squared bit weight of the incoming x_j is reduced to a shift to the left in each
        iteration. The implementation consists of n regular bit-slices, which makes it suitable
        for hardware implementation.
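
            The shift-accumulator idea can be modeled behaviorally as follows. This is a rough
        sketch and not a gate-level description of Figure 11.35: it treats the input as an
        integer X = x * 2^n, and the names shift_accumulate_square, acc, and stored are
        hypothetical. The point it illustrates is that the stored bits stay at fixed positions
        while the accumulator moves right by one position per iteration.

```python
def shift_accumulate_square(bits_lsb_first):
    """Return the bits of X*X, least significant first, for X given LSB first."""
    n = len(bits_lsb_first)
    acc = 0      # accumulator, aligned to the output bit currently being produced
    stored = 0   # bits received so far, kept at fixed positions
    out_bits = []
    for k, b in enumerate(bits_lsb_first):
        if b:
            # squared bit weight moves left one position per iteration (1 << k);
            # the stored bits enter with a fixed factor of two (a wiring offset)
            acc += (1 << k) + (stored << 1)
            stored |= 1 << k
        out_bits.append(acc & 1)   # this result bit is now final
        acc >>= 1                  # shift the accumulated sum to the right
    for _ in range(n):             # flush the upper half of the result
        out_bits.append(acc & 1)
        acc >>= 1
    return out_bits


bits = shift_accumulate_square([1, 1, 0, 1])          # X = 11, LSB first
print(sum(b << i for i, b in enumerate(bits)))         # 121
```

            Each loop pass emits one finished result bit, which matches the observation that
        the sum settles from the least significant bit upward.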
            The operation of the squarer in Figure 11.35 is as follows: All D flip-flops and
        SR flip-flops are assumed to be cleared before the first iteration. In the first
        iteration, the control signal c(0) is high while the remaining control signals are low. This
        allows the first bit x(0) = x_n to pass the AND gate on top of the rightmost bit-slice.
        The value x_n is then stored in the SR flip-flop of the same bit-slice for later use. Also,
        x_n is added to the accumulated sum via the OR gate in the same bit-slice,³

        ³ An OR gate is sufficient to perform the addition.