Page 105 - Hardware Implementation of Finite-Field Arithmetic
For $m = 2^{192} - 2^{64} - 1$ ($k = 192$), four circuits have been
implemented: csa_mod_multiplier, dar_mod_multiplier,
dar_csa_multiplier, Montgomery_multiplier (remember that the
Montgomery multiplier does not compute $xy \bmod m$ but
$xy2^{-k} \bmod m$). The cost and delay of several multipliers are
shown in Table 3.3.
Multiplier    FF      LUTs    Slices   Period   Cycles   Total time
csa_mod       1,271   3,678   2,053     6.233   384      2,393.5
dar_mod         400     593     400    23.615   384      9,068.2
dar_csa         597   1,835   1,113     9.796   384      3,761.7
Montgomery      612   1,398     922     6.765   198      1,339.5
TABLE 3.3 Cost and Delay of mod $2^{192} - 2^{64} - 1$ Multipliers
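As a reminder of what the Montgomery circuit computes, the following
Python sketch implements a bit-serial Montgomery product for an odd
modulus and recovers $xy \bmod m$ with a second Montgomery product by
the constant $2^{2k} \bmod m$. It is only a software illustration
under the assumption $x, y < m < 2^k$, not the VHDL model behind
Table 3.3; the names mont and mod_mul are hypothetical.

```python
# Illustrative sketch only: names and structure are assumptions,
# not the Montgomery_multiplier entity of the book.
M = 2**192 - 2**64 - 1
K = 192

def mont(x, y, m=M, k=K):
    """Bit-serial Montgomery product: x*y*2^(-k) mod m, m odd, x, y < m."""
    p = 0
    for i in range(k):
        p += ((x >> i) & 1) * y   # add y when bit i of x is set
        if p & 1:                 # make p even by adding the odd modulus
            p += m
        p >>= 1                   # exact division by 2
    return p - m if p >= m else p  # p < 2m here, one subtraction suffices

def mod_mul(x, y, m=M, k=K):
    """Ordinary product x*y mod m obtained from two Montgomery products."""
    r2 = pow(2, 2 * k, m)         # constant 2^(2k) mod m, computed once
    return mont(mont(x, y, m, k), r2, m, k)
```

The second call cancels the factor $2^{-k}$ left by the first one, which
is why a Montgomery multiplier is usually compared with the other
circuits only after this correction is taken into account.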
Another mod $2^{192} - 2^{64} - 1$ multiplier implementation is
reported in App. A (Sec. A.4.1). It uses a sequential multiplier and
the combinational reducer of Sec. 2.6.2.
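The multiply-then-reduce principle can be sketched in Python as
follows. The reduction relies only on the congruence
$2^{192} \equiv 2^{64} + 1 \pmod{m}$, which follows from the definition
of $m$; it is not claimed to reproduce the reducer of Sec. 2.6.2 or the
circuit of App. A, and the function names are hypothetical.

```python
# Illustrative sketch: shift-and-add product followed by a reduction
# that folds the high part using 2^192 = 2^64 + 1 (mod M).
M = 2**192 - 2**64 - 1
MASK = 2**192 - 1

def shift_add_mul(x, y, k=192):
    """Plain k-step shift-and-add product, no reduction."""
    z = 0
    for i in range(k):
        if (y >> i) & 1:
            z += x << i
    return z

def reduce_p192(z):
    """Reduce a nonnegative integer z modulo M."""
    while z >> 192:                  # some bits remain above position 191
        hi, lo = z >> 192, z & MASK
        z = lo + (hi << 64) + hi     # hi*2^192 replaced by hi*(2^64 + 1)
    return z - M if z >= M else z    # z < 2^192 < 2M at this point

def mul_then_reduce(x, y):
    return reduce_p192(shift_add_mul(x, y))
```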
3.6.3 mod m Exponentiators
Two values of $m$ are considered: $m = 239$ ($k = 8$) and
$m = 2^{192} - 2^{64} - 1$ ($k = 192$). The implementation results are
the following (Tables 3.4 and 3.5):
Algorithm    FF    LUTs   Slices   Period   Cycles   Total time
MSB-first    70    166     93      6.960    128      891
LSB-first    97    265    140      7.533     64      482
TABLE 3.4 Cost and Delay of mod 239 Exponentiators
Algorithm    FF      LUTs    Slices   Period   Cycles   Total time
MSB-first    1,185   1,993   1,199    8.176    73,733   602,841
LSB-first    1,779   3,554   1,983    8.871    36,869   327,065
TABLE 3.5 Cost and Delay of mod $2^{192} - 2^{64} - 1$ Exponentiators
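The two exponentiation schemes can be compared with the following
Python sketch of the standard MSB-first and LSB-first binary methods.
The slot counters rest on an assumption that is not stated in the
text: the MSB-first circuit executes its square and its multiply one
after the other, while the two independent products of the LSB-first
method share one time slot. If each slot took $k$ cycles, this would
give $2k^2$ and $k^2$ cycles, that is 128 and 64 for $k = 8$, and
73,728 and 36,864 for $k = 192$, close to the cycle counts of
Tables 3.4 and 3.5. Function and variable names are hypothetical.

```python
# Illustrative sketch of y^x mod m for a k-bit exponent x.
def exp_msb_first(y, x, m, k):
    e, slots = 1, 0
    for i in reversed(range(k)):     # scan exponent bits from the MSB
        e = e * e % m                # square
        slots += 1
        if (x >> i) & 1:
            e = e * y % m            # multiply, depends on the square
        slots += 1                   # assumed: slot used in every step
    return e, slots

def exp_lsb_first(y, x, m, k):
    e, b, slots = 1, y % m, 0
    for i in range(k):               # scan exponent bits from the LSB
        if (x >> i) & 1:
            e = e * b % m            # uses the old value of b
        b = b * b % m                # independent of the product above
        slots += 1                   # assumed: both products in parallel
    return e, slots
```

Under these assumptions the LSB-first version needs two multipliers
but half as many slots, which matches the direction of the area and
time figures in both tables.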
3.7 Comments and Conclusions
The experimental results do not completely confirm the theoretical
results of Table 3.1. Among the circuits that compute $xy \bmod m$,
the fastest multiplier is obtained with the csa_mod_multiplier entity
and not with the dar_csa_multiplier entity. On the other hand, the
latter uses fewer slices. As regards exponentiation, the fastest
circuit is obtained with the LSB-first algorithm, and the most
cost-effective with the MSB-first algorithm.
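The multiplier comparison can be re-derived from the figures of
Table 3.3 with a few lines of Python. The sketch below only restates
the published numbers (slices, period, cycles, total time) and checks
that the total time equals period times cycles up to rounding; the
dictionary layout and variable names are illustrative.

```python
# Figures copied from Table 3.3: (slices, period, cycles, total time).
rows = {
    "csa_mod":    (2053,  6.233, 384, 2393.5),
    "dar_mod":    ( 400, 23.615, 384, 9068.2),
    "dar_csa":    (1113,  9.796, 384, 3761.7),
    "Montgomery": ( 922,  6.765, 198, 1339.5),
}

# Total time recomputed as period * cycles.
recomputed = {name: r[1] * r[2] for name, r in rows.items()}

# Montgomery computes x*y*2^(-k) mod m, so it is set aside when
# ranking the circuits that compute x*y mod m directly.
direct = {n: t for n, t in recomputed.items() if n != "Montgomery"}
fastest_direct = min(direct, key=direct.get)
fastest_overall = min(recomputed, key=recomputed.get)
fewer_slices = rows["dar_csa"][0] < rows["csa_mod"][0]
```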