Page 292 - ARM 64 Bit Assembly Language


        bl      fixed_cos
        mov     x1,#28
        ldr     x2,=fmta
        bl      printS

        ldr     x0,=newline
        bl      printf

        add     x27,x27,#1              // increment the loop counter
        cmp     x27,#33
        blt     mloop                   // repeat while x27 < 33

        ldp     x29,x30,[sp],#16        // restore frame pointer and link register
        ldp     x27,x28,[sp],#16        // restore callee-saved x27 and x28
        ret

//----------------------------------------------------------------

Table 8.4: Performance of sine function with various implementations.

Optimization  Implementation                     CPU seconds
None          32-bit Fixed Point Assembly               5.15
              32-bit Fixed Point C                     18.60
              Single Precision floating point C        10.56
              Double Precision floating point C        10.07
-O2           32-bit Fixed Point Assembly               4.73
              32-bit Fixed Point C                      5.38
              Single Precision floating point C         9.36
              Double Precision floating point C         9.23
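
The times in Table 8.4 count only the CPU time spent inside the sine function. One way to obtain such a measurement is to generate all inputs first and read the processor clock immediately before and after the loop that calls the function. The following C sketch illustrates the idea; it is not the program used to produce the table. The names cpu_seconds_in, fill_random_angles, and taylor_sin are hypothetical, and taylor_sin is only a short double precision stand-in for whichever sine implementation is being measured.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Type of the function under test (assumed signature, for illustration). */
typedef double (*sine_fn)(double);

/* Stand-in sine: four-term Taylor polynomial, reasonable only for small |x|.
   Any function with the same signature could be measured instead. */
double taylor_sin(double x)
{
    double x2 = x * x;
    return x * (1.0 - x2 / 6.0 * (1.0 - x2 / 20.0 * (1.0 - x2 / 42.0)));
}

/* Fill buf with n pseudo-random angles in [0, 2*pi).
   This runs before timing starts, so it is excluded from the result. */
void fill_random_angles(double *buf, size_t n, unsigned seed)
{
    const double two_pi = 6.283185307179586;
    srand(seed);
    for (size_t i = 0; i < n; i++)
        buf[i] = two_pi * ((double)rand() / ((double)RAND_MAX + 1.0));
}

/* Return the CPU seconds consumed while applying fn to n inputs.
   The running sum is handed back through checksum so that an optimizing
   compiler cannot discard the calls as dead code. */
double cpu_seconds_in(sine_fn fn, const double *in, size_t n, double *checksum)
{
    assert(fn != NULL && in != NULL && n > 0);

    double sum = 0.0;
    clock_t start = clock();            /* CPU time, not wall-clock time */
    for (size_t i = 0; i < n; i++)
        sum += fn(in[i]);
    clock_t stop = clock();

    if (checksum != NULL)
        *checksum = sum;
    return (double)(stop - start) / CLOCKS_PER_SEC;
}
```

A caller would allocate an array, call fill_random_angles once, and then pass each candidate function to cpu_seconds_in with the same inputs, so that every implementation is timed on identical data.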

8.6.4 Performance comparison

In some situations, fixed point math offers a substantial advantage. Many processors
intended for use in embedded systems have no hardware floating point unit.
Table 8.4 shows the CPU time required to compute the sine function on 100,000,000
random values, using various implementations of the sine function. In each case, the
program's main() function was written in C. The four implementations differed only in
the data type (fixed point, IEEE single precision, or IEEE double precision) and in the
sine function that was used. The times shown in the table include only the CPU time
actually used in the sine function; they exclude the time required for program startup,
storage allocation, random number generation, printing results, and program exit.
The four implementations are as follows:
