Page 204 - Hardware Implementation of Finite-Field Arithmetic
P. 204

184    Cha pte r  Se v e n


               by x in step 5. The overall operation count in each algorithmic round
               is two additions, two multiplications, and one division by x in the
               worst case (a  = c  = 1).
                          i  0
                  If the function
                  function lshift(x: poly_vector) return poly_vector
               computing the 1-bit left shift of a given binary polynomial is available,
               then the bit-level algorithm for Montgomery multiplication over
                   m
               GF(2 )  given in  Algorithm 7.7 can be rewritten in the following
               algorithm where the function lshift is used:
               Algorithm 7.8—Bit-level Montgomery multiplication
               for i in 0 .. m-1 loop c(i) := 0; end loop;
               for i in 0 .. m-1 loop
                 c := m2xvv(c,m2abv(a(i),b));
                 if c(0) = 1 then
                   c := m2xvv(c,m2abv(c(0),f));
                   c := lshift(c);
                   c(m-1) := 1;
                 else
                   c := lshift(c);
                 end if;
               end loop;
                  An executable  Ada file bmult_montgomery.adb, including
               Algorithm 7.8, is available at www.arithmetic-circuits.org. It must be
               noted that the use of a left-shift function for the implementation of
               the division by x in Algorithm 7.8 is because the binary coefficients of
               the polynomials defined by poly_vector are in order 0 . . . m – 1 (left-to-
               right). Furthermore, the assignment c(m − 1) := 1 for c(0) = 1 is because
               f(x) is a polynomial with m + 1 coefficients, while c(x) has m coefficients;
               for the addition c(x): = c(x) + c   f(x), the above assignment represents
                                        0
                                    m
               the addition of the term x from f(x).
                  A slightly different version of the Algorithm 7.8 is also available
               at www.arithmetic-circuits.org. This second version is given in the
               executable Ada file bmult_montgomery_v2.adb, which rewrites the
               Algorithm 7.8 as follows:

               for i in 0 .. m-1 loop
                 c := m2xvv(c,m2abv(a(i),b));
                 prev_c0 := c(0);
                 c := m2xvv(c,m2abv(c(0),f));
                 c := lshift(c);
                 c(m-1) := prev_c0;
               end loop;

                  A VHDL file montgomery_mult.vhd, modeling a sequential
               implementation of the above second version of  Algorithm 7.8, is
               available at www.arithmetic-circuits.org. The corresponding datapath
   199   200   201   202   203   204   205   206   207   208   209