Page 59 - Hardware Implementation of Finite-Field Arithmetic

42    Chapter Two


               delay is equal to $kT_{\mathrm{MULT}} + (1 + d - k)T_{\mathrm{FA}}$. An upper bound is $(d + 1)T_{\mathrm{MULT}}$,
               and an upper bound of $d$ can be calculated as follows:
               $$X_{s-1} b_{s-1} + X_{s-2} b_{s-2} + \cdots + X_1 b_1 + X_0 b_0 < 2^k (b_{s-1} + b_{s-2} + \cdots + b_1 + b_0) < s\,2^{2k} = (n/k)\,2^{2k}$$
               so that an upper bound of $d$ is $2k + \log_2 n - \log_2 k$. Assuming that the
               inner loop of Algorithm 2.6 is executed only once (best case), the total
               computation time is about
               $$T \approx s(2k + \log_2 n - \log_2 k + 1)T_{\mathrm{MULT}} \approx 2nT_{\mathrm{MULT}} \qquad (2.36)$$
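The bound on $d$ can be checked numerically. The Python sketch below is an illustration, not part of the original text. It reads $d$ as the bit length of the accumulated sum, assumes that $n$ is a multiple of $k$, and assumes that the precomputed constants are $b_i = 2^{ik} \bmod m$ (this page does not restate their definition). All function names are hypothetical.

```python
import math

def digit_table(m, k, s):
    """Assumed precomputed constants b_i = 2^(i*k) mod m, for i = 0..s-1."""
    return [pow(2, i * k, m) for i in range(s)]

def worst_case_sum(m, k, s):
    """Largest X_{s-1} b_{s-1} + ... + X_0 b_0, reached when every X_i = 2^k - 1."""
    return sum((2**k - 1) * b for b in digit_table(m, k, s))

def d_upper_bound(n, k):
    """Bound from the text: d <= 2k + log2(n) - log2(k)."""
    return 2 * k + math.log2(n) - math.log2(k)

def time_estimate(n, k):
    """Left-hand estimate of (2.36), in units of T_MULT."""
    s = n // k
    return s * (d_upper_bound(n, k) + 1)

n, k = 64, 8
s = n // k
m = 251  # example k-bit modulus with 2^(k-1) <= m < 2^k
acc_bits = worst_case_sum(m, k, s).bit_length()
print(acc_bits, d_upper_bound(n, k), time_estimate(n, k), 2 * n)
```

With $n = 64$ and $k = 8$ the bound gives $d \le 19$ and the estimate is $160\,T_{\mathrm{MULT}}$, against $2n = 128$; the final approximation in (2.36) keeps only the dominant $2k$ term.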
                  A VHDL file precomputation_reducer.vhd is available at
               www.arithmetic-circuits.org. The corresponding entity declaration is:
               entity precomputation_reducer is
               port (
                 x: in std_logic_vector (N-1 downto 0);
                 m: in std_logic_vector(K-1 downto 0);
                 clk, reset, start: in std_logic;
                 z: out std_logic_vector (K-1 downto 0);
                 done: out std_logic
               );
               end precomputation_reducer;
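A software reference model can be handy when checking the output z of such an entity in a testbench. The Python sketch below is one plausible reading of the reducer's behavior, not the authors' code: it assumes N is a multiple of K, that $2^{K-1} \le m < 2^K$, and that the table holds $b_i = 2^{iK} \bmod m$. The names `split_digits` and `precomputation_reduce` are hypothetical.

```python
def split_digits(r, k, s):
    """Split integer r into s base-2^k digits, least significant first."""
    mask = (1 << k) - 1
    return [(r >> (i * k)) & mask for i in range(s)]

def precomputation_reduce(x, m, n, k):
    """Behavioral model: return (x mod m, number of accumulation passes).

    Assumptions (not from the original text): n is a multiple of k,
    2^(k-1) <= m < 2^k, and b_i = 2^(i*k) mod m.
    """
    s = n // k
    b_table = [pow(2, i * k, m) for i in range(s)]
    r = x
    passes = 0
    while r >= (1 << k):              # r still has more than one digit
        acc = 0                       # accumulator cleared before each pass
        for digit, b in zip(split_digits(r, k, s), b_table):
            acc += digit * b          # accumulate X_i * b_i
        r = acc                       # feed the sum back as the new operand
        passes += 1
    while r >= m:                     # final correction into [0, m)
        r -= m
    return r, passes

x = 0x123456789ABCDEF0
z, passes = precomputation_reduce(x, 251, 64, 8)
print(z, passes)
```

Each pass preserves the value modulo m and strictly decreases r while r has more than one digit, so the loop terminates; the best case discussed in the text corresponds to a single pass.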

                  The VHDL architecture corresponding to the circuit of Fig. 2.7 is
               the following:

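               -- split register r into s digits of K bits each
               digit_selection: for i in 0 to s-1 generate
                   vector_r(i) <= r((i+1)*K-1 downto i*K);
               end generate;
               -- select the current digit and the matching precomputed constant
               vector_i <= vector_r(conv_integer(sel));
               b_i <= b_table(conv_integer(sel));
               -- multiply and accumulate
               product <= vector_i * b_i;
               next_acc <= acc+product;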
               register_acc: process(reset, clk)
               begin
                 if reset = '1' then acc <= (others => '0');
                 elsif clk'event and clk = '1' then
                   if load = '1' or reload = '1' then acc <= (others => '0');
                   elsif ce_acc = '1' then acc <= next_acc;
                   end if;
                 end if;
               end process register_acc;
               register_r: process(reset, clk)
               begin
                 if reset = '1' then r <= (others => '0');
                 elsif clk'event and clk = '1' then
                   if load = '1' then r <= x;
                   elsif reload = '1' then r <= ZERO & acc;
                   end if;
                 end if;