Page 64 - Hardware Implementation of Finite-Field Arithmetic

P. 64

mod m Reduction 47

relative value of n and k : y and c are (n − k + 1)-bit numbers, q is a
(k + 2)-bit number and m a k-bit number. Thus,

n = max{n − k + 1, k + 2} and n = max{n − k + 1, k}
1 2
The minimum period of the clock signal is the delay of an n -bit by
1
n -bit multiplier whose only n + 3 outputs (less significant bits) are
2
used. If a carry-save multiplier is used, the corresponding computation
time is about (n + 3)T . According to Eqs. (2.41) and (2.49), r’ < 3m, so
MULT
that the final result is either r’, r’ − m, or r’ − 2m. Thus, the whole
computation is performed in at most 5 cycles, so that
T ≈ 5(n + 3)T (2.51)
MULT
A VHDL file Barrett_reducer.vhd is available at www.arithmetic-
circuits.org. The corresponding entity declaration is
entity Barrett_reducer is
port (
x: in std_logic_vector (N-1 downto 0);
m: in std_logic_vector(K-1 downto 0);
clk, reset, start: in std_logic;
z: out std_logic_vector (K-1 downto 0);
done: out std_logic
);
end Barrett_reducer;

If n > 2k, and thus n = n = n − k + 1, the VHDL architecture
1 2
corresponding to the circuit of Fig. 2.9 is the following:
with sel_mul select mul1 <= x(N-1 downto K-1) when ‘0’,
(ZERO & q) when others;
with sel_mul select mul2 <= c when ‘0’, (“00” & ZERO & m)
when others;
mul_out <= mul1 * mul2;
register_prod: process(clk)
begin
if clk’event and clk=‘1’ then
if ce_prod = ‘1’ then prod <= mul_out(n+2 downto 0); end if;
end if;
end process;
q <= prod(n+2 downto n-k+1);
with sel_sub select sub1 <= x(k+1 downto 0) when ‘0’, r when
others;
with sel_sub select sub2 <= prod(k+1 downto 0) when ‘0’,
(“00”&m) when others;
dif <= (‘0’ & sub1) - (‘0’ & sub2);
sign <= dif(k+2);
register_r: process(clk)
begin
if clk’event and clk=‘1’ then

59 60 61 62 63 64 65 66 67 68 69