Page 48 - Hardware Implementation of Finite-Field Arithmetic
P. 48
mod m Reduction 31
be the 4-bit naturals corresponding to the four most significant bits of
s ’ and s ’, and compute t = s ’ + s ’ mod 16. Then the value of q can be
s c s c
defined as a function of t: q = 1 if t ≤ 2, q = − 1 if 11 ≤ t ≤ 14, q = 0 if
t = 15, and t never belongs to the interval 3 ≤ t ≤ 10.
Algorithm 2.3 —Computation of q-SRT algorithm with stored-carry encoding
function quotient(ss, sc, y: in natural) return integer is
ss_high, sc_high, t: natural;
begin
ss_high := ss / (2**(n-1));
sc_high := sc / (2**(n-1));
t := (ss_high+sc_high) mod 16;
if t <= 2 then return 1;
elsif t < 15 then return -1;
else return 0;
end if;
end quotient;
An executable Ada file srt_reducer.adb including Algorithm 2.2 as
well as the functions csa_sum, csa_carry, and quotient (Algorithm 2.3)
is available at www.arithmetic-circuits.org.
The datapath corresponding to Algorithm 2.2, with the quotient
function defined by Algorithm 2.3, is shown in Fig. 2.4. The Boolean
functions defining q (2’s complement) are
q = t ⊕ tt t q = tt t (2.18)
21 0
3
1
21 0
0
The minimum period of the clock signal is equal to the delay of a
4-bit adder, plus the computation time of a 4-input Boolean function,
plus the delay of a 1-bit adder, that is, 5T + T . Assuming that T
FA Boolean4 FA
and T have the same order of magnitude, the minimum clock
Boolean4
period is approximately equal to
T ≈ 6T (2.19)
CLK FA
A lower bound of the total computation time, including the final
steps (decoding from stored-carry form to normal form and correction
if z < 0), is given by
T = (n − k + 1)T + (k + 1)T ≈ 6(n − k + 1)T + (k + 1)T (2.20)
CLK FA FA FA
that is a linear function of n and k instead of a quadratic function, as
in the case of Algorithm 2.1 [Eq. (2.14)]. In Eq. (2.20) the term (k + 1)T
FA
corresponds to the final operations; they are not executable in one
clock cycle.