Page 204 - Hardware Implementation of Finite-Field Arithmetic
P. 204
184 Cha pte r Se v e n
by x in step 5. The overall operation count in each algorithmic round
is two additions, two multiplications, and one division by x in the
worst case (a = c = 1).
i 0
If the function
function lshift(x: poly_vector) return poly_vector
computing the 1-bit left shift of a given binary polynomial is available,
then the bit-level algorithm for Montgomery multiplication over
m
GF(2 ) given in Algorithm 7.7 can be rewritten in the following
algorithm where the function lshift is used:
Algorithm 7.8—Bit-level Montgomery multiplication
for i in 0 .. m-1 loop c(i) := 0; end loop;
for i in 0 .. m-1 loop
c := m2xvv(c,m2abv(a(i),b));
if c(0) = 1 then
c := m2xvv(c,m2abv(c(0),f));
c := lshift(c);
c(m-1) := 1;
else
c := lshift(c);
end if;
end loop;
An executable Ada file bmult_montgomery.adb, including
Algorithm 7.8, is available at www.arithmetic-circuits.org. It must be
noted that the use of a left-shift function for the implementation of
the division by x in Algorithm 7.8 is because the binary coefficients of
the polynomials defined by poly_vector are in order 0 . . . m – 1 (left-to-
right). Furthermore, the assignment c(m − 1) := 1 for c(0) = 1 is because
f(x) is a polynomial with m + 1 coefficients, while c(x) has m coefficients;
for the addition c(x): = c(x) + c f(x), the above assignment represents
0
m
the addition of the term x from f(x).
A slightly different version of the Algorithm 7.8 is also available
at www.arithmetic-circuits.org. This second version is given in the
executable Ada file bmult_montgomery_v2.adb, which rewrites the
Algorithm 7.8 as follows:
for i in 0 .. m-1 loop
c := m2xvv(c,m2abv(a(i),b));
prev_c0 := c(0);
c := m2xvv(c,m2abv(c(0),f));
c := lshift(c);
c(m-1) := prev_c0;
end loop;
A VHDL file montgomery_mult.vhd, modeling a sequential
implementation of the above second version of Algorithm 7.8, is
available at www.arithmetic-circuits.org. The corresponding datapath